Compare commits

...

87 Commits

Author SHA1 Message Date
shunf4
ea290d53ad fix: EDITCHANGE should not append; SELCHANGE should delay (2) 2024-09-03 01:22:49 +08:00
shunf4
e2770ed02a fix: EDITCHANGE should not append; SELCHANGE should delay 2024-09-03 01:15:38 +08:00
shunf4
abb4a992d9 feat: append filename also on combo select 2024-09-03 00:50:16 +08:00
shunf4
2dbb212482 fix: ci; change document, 7-zip 23 is not latest now 2024-09-03 00:36:12 +08:00
shunf4
dee3259e58 copy history excludes appended filename path segment; always check open output folder, unless manually changing registry 2024-09-03 00:21:26 +08:00
shunf4
fb962a8070 feat: big dialog font 2024-05-14 11:36:59 +08:00
shunf4
eaf6d6d0c6 doc: update doc 2024-05-14 09:55:56 +08:00
Tino Reichardt
b7e2b5ca60 (shunf4 cherry-pick zstd 5a0006bf) Fix #76 - thanks go to @Liz-chan for pointing out ;) 2024-05-14 09:40:40 +08:00
shunf4
f941b40919 update: update docs 2024-05-14 09:38:08 +08:00
shunf4
cd0053957b docs: update docs 2024-05-13 22:37:05 +08:00
shunf4
c79f6c7f34 ci: add branches to trigger 2024-05-13 22:23:29 +08:00
Tino Reichardt
e19abb2958 shunf4 cherry-picking hash related commits from zstd. The following commits from 7-zip-zstd repository (https://github.com/mcmilk/7-Zip-zstd) is picked:
commit add56b5aed
Author: Tino Reichardt <milky-7zip@mcmilk.de>
Date:   Thu Nov 1 23:08:00 2018 +0100

    Add MD5 hash function

commit 36a17a5184
Author: Tino Reichardt <milky-7zip@mcmilk.de>
Date:   Sat Nov 3 00:18:33 2018 +0100

    Add some hash functions
    - new: md2, md4, md5, sha384, sha512, xxhash-32, xxhash-64
    - put Blake2sp hash stuff back to rar code
    - added the hashes to GUI and Explorer Menu code

commit 576c5df947
Author: Tino Reichardt <milky-7zip@mcmilk.de>
Date:   Tue Apr 6 19:35:46 2021 +0200

    Add BLAKE3 hash function

commit 6b2a151549
Author: Tino Reichardt <milky-7zip@mcmilk.de>
Date:   Tue Apr 6 19:51:01 2021 +0200

    Remove unneeded file HashesReg.cpp

commit dddf507557
Author: Tino Reichardt <milky-7zip@mcmilk.de>
Date:   Sun Jun 18 09:13:59 2023 +0200

    Add SHA3 hashing

    - added these variants: SHA3-256, SHA3-384, SHA3-512
    - reordered also the hashing id's
    - added some notes about them in DOC/Hashes.txt

    Signed-off-by: Tino Reichardt <milky-7zip@mcmilk.de>

The cherry-picking was a chaos; they're not applied in order, and some
commits even got cherry-picked twice (1->4->0->2->4->3). So subsequent fixes and
adjustments were applied to make it build.
2024-05-13 22:20:40 +08:00
shunf4
cd0993fe9c feat: go on cancel folder priority over file in comparison 2024-05-11 16:06:57 +08:00
shunf4
6b5da20fb6 feat: opens sole folder instead of upper folder after extraction; cancel folder priority over file in comparison; other minor ui fix 2024-05-11 15:59:23 +08:00
shunf4
7c6d4e7757 feat: 1. drag to panel address combobox opens the archive/dir; 2. mouse forward/backward key nav 2024-05-11 00:29:54 +08:00
shunf4
ceda12136d fix: SoleFolderIndex: continue fixing method decl/impl 2024-05-10 19:50:32 +08:00
shunf4
b0fd5cfa48 fix: SoleFolderIndex: continue fixing method decl/impl 2024-05-10 19:45:55 +08:00
shunf4
e76e0f5d57 fix: SoleFolderIndex: continue fixing method decl/impl 2024-05-10 19:40:20 +08:00
shunf4
5be705687e fix: remove soleFolderIndex arg in IFolderOperations::CopyTo 2024-05-10 19:29:11 +08:00
shunf4
ffffba9e20 fix: compile error, using another way to pass down SoleFolderIndex 2024-05-10 19:22:11 +08:00
shunf4
fa9ded58f1 fix: missing soleFolderIndex arg in IFolder::CopyTo 2024-05-10 16:15:29 +08:00
shunf4
689d50eb7f fix: missing SoleFolderIndex field in CPP/7zip/UI/Common/ArchiveExtractCallback.h 2024-05-10 16:10:12 +08:00
shunf4
76db359120 feat: do not set time for sole folder in extraction 2024-05-10 15:52:53 +08:00
Shun Zi
fabeab4a9f Update README.md 2024-04-20 22:47:04 +08:00
Shun Zi
e96529572c Update README.md 2024-04-20 22:44:39 +08:00
Shun Zi
4baa50a867 Update README.md 2024-04-20 22:44:16 +08:00
shunf4
b22709156d fix: remove unused kPathHistory 2024-04-20 21:50:40 +08:00
shunf4
9d218e2681 when the archive has only one folder as its direct child, do not add filename to path by default on extraction; if not, add filename to path 2024-04-20 21:16:28 +08:00
shunf4
4b7e9f0800 setup ci 2024-04-20 21:16:23 +08:00
shunf4
33fc299a36 make changes to about dialog 2024-04-20 21:16:02 +08:00
shunf4
b085993aae apply James Hoo 's other original mod 2024-04-20 21:14:57 +08:00
shunf4
d5255dec84 make it build after mod 2024-04-20 12:48:57 +08:00
glachancecmaisonneuve
18725aeba6 easy 7-zip mod for 23.01: rebased from 19.00 2024-04-20 11:42:14 +08:00
Kornel Lesiński
082657f61b Github info 2023-12-22 17:17:26 +00:00
Igor Pavlov
a36c48cece 23.01 2023-12-22 17:17:05 +00:00
Igor Pavlov
ec44a8a070 22.00 2022-06-23 11:43:16 +01:00
Igor Pavlov
c3529a41f5 21.07 2022-01-22 18:43:09 +00:00
Kornel
52eeaf1ad6 Merge pull request #4 from FnControlOption/2106 2021-12-18 11:01:24 +00:00
Igor Pavlov
ccbf6ad3c1 21.06 2021-11-28 19:08:41 -08:00
Igor Pavlov
1194dc9353 21.04 2021-11-28 19:03:01 -08:00
Igor Pavlov
d789d4137d 21.03 2021-11-28 19:01:13 -08:00
Igor Pavlov
585698650f 21.02 2021-07-22 23:00:14 +01:00
Igor Pavlov
4a960640a3 19.00 2019-03-04 01:27:14 +00:00
Igor Pavlov
5b2a99c548 18.06 2018-12-30 14:01:47 +00:00
Igor Pavlov
18dc2b4161 18.05 2018-05-02 22:28:04 +01:00
Igor Pavlov
f19b649c73 18.03 2018-03-12 11:19:46 +00:00
Igor Pavlov
866a06f5a0 18.01 2018-01-30 00:35:06 +00:00
Igor Pavlov
da28077952 18.00 2018-01-11 22:16:32 +01:00
Igor Pavlov
b5dc853b24 17.01 2017-08-29 20:49:43 +01:00
Igor Pavlov
2efa10565a 17.00 2017-05-05 18:56:20 +01:00
Igor Pavlov
603abd5528 16.04 2016-12-08 12:13:50 +00:00
Igor Pavlov
232ce79574 16.03 2016-12-08 12:12:54 +00:00
Igor Pavlov
1eddf527ca 16.02 2016-05-28 00:17:00 +01:00
Igor Pavlov
bec3b479dc 16.01 2016-05-28 00:16:59 +01:00
Igor Pavlov
66ac98bb02 16.00 2016-05-28 00:16:59 +01:00
Igor Pavlov
c20d013055 15.14 2016-05-28 00:16:58 +01:00
Igor Pavlov
9608215ad8 15.13 2016-05-28 00:16:58 +01:00
Igor Pavlov
5de23c1deb 15.12 2016-05-28 00:16:58 +01:00
Igor Pavlov
e24f7fba53 15.11 2016-05-28 00:16:57 +01:00
Igor Pavlov
7c8a265a15 15.10 2016-05-28 00:16:57 +01:00
Igor Pavlov
a663a6deb7 15.09 2016-05-28 00:16:56 +01:00
Igor Pavlov
6543c28020 15.08 2016-05-28 00:16:56 +01:00
Igor Pavlov
f6444c3256 15.07 2016-05-28 00:16:55 +01:00
Igor Pavlov
cba375916f 15.06 2016-05-28 00:16:55 +01:00
Igor Pavlov
54490d51d5 15.05 2016-05-28 00:16:54 +01:00
Igor Pavlov
0713a3ab80 9.38 2016-05-28 00:16:53 +01:00
Igor Pavlov
7e021179cd 9.36 2016-05-28 00:16:53 +01:00
Igor Pavlov
0dc16c691d 9.35 2016-05-28 00:16:53 +01:00
Igor Pavlov
f08f4dcc3c 9.34 2016-05-28 00:16:51 +01:00
Igor Pavlov
83f8ddcc5b 9.22 2016-05-28 00:16:06 +01:00
Igor Pavlov
35596517f2 9.21 2016-05-28 00:16:05 +01:00
Igor Pavlov
de4f8c22fe 9.20 2016-05-28 00:16:05 +01:00
Igor Pavlov
b75af1bba6 9.19 2016-05-28 00:16:04 +01:00
Igor Pavlov
c65230d858 9.18 2016-05-28 00:16:04 +01:00
Igor Pavlov
2eb60a0598 9.17 2016-05-28 00:16:04 +01:00
Igor Pavlov
044e4bb741 9.16 2016-05-28 00:16:03 +01:00
Igor Pavlov
e279500d76 9.15 2016-05-28 00:16:03 +01:00
Igor Pavlov
708873490e 9.14 2016-05-28 00:16:03 +01:00
Igor Pavlov
3dacb5eb8a 9.13 2016-05-28 00:16:03 +01:00
Igor Pavlov
76b173af78 9.12 2016-05-28 00:16:02 +01:00
Igor Pavlov
993daef9cb 9.11 2016-05-28 00:16:02 +01:00
Igor Pavlov
db5eb6d638 9.10 beta 2016-05-28 00:16:02 +01:00
Igor Pavlov
1fbaf0aac5 9.09 beta 2016-05-28 00:16:01 +01:00
Igor Pavlov
2fed872194 9.07 beta 2016-05-28 00:16:01 +01:00
Igor Pavlov
c99f3ebdd6 9.06 beta 2016-05-28 00:16:00 +01:00
Igor Pavlov
829409452d 9.04 beta 2016-05-28 00:15:59 +01:00
Igor Pavlov
8874e4fbc9 4.65 2016-05-28 00:15:59 +01:00
1410 changed files with 285227 additions and 82558 deletions

13
.gitattributes vendored Normal file
View File

@@ -0,0 +1,13 @@
# Set default behavior to automatically normalize line endings.
* text=crlf
# These files are text and should be normalized (Convert crlf => lf)
*.txt text
*.vcproj text
*.cpp text
*.h text
*.def text
*.rc text
*.cmd

42
.github/workflows/build.yaml vendored Normal file
View File

@@ -0,0 +1,42 @@
name: Build and Upload Artifact
on:
push:
branches:
- easy7zip-sf
- e7z-sf-zstd
- e7z-sf-without-zstd
pull_request:
branches:
- easy7zip-sf
- e7z-sf-zstd
- e7z-sf-without-zstd
jobs:
build:
runs-on: windows-latest
steps:
- name: Checkout repository
uses: actions/checkout@v2
- name: Build project
shell: cmd
run: ${{ '"C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Auxiliary\Build\vcvars64.bat" && build.cmd' }}
- name: Download and unpack Lang
shell: cmd
run: ${{ 'cd out && curl -L -o official.exe.7z "https://sourceforge.net/projects/sevenzip/files/latest/download" && .\7z x official.exe.7z Lang/ License.txt && DEL official.exe.7z' }}
- name: Pack
shell: cmd
run: ${{ 'cd out && COPY 7zipUninstall.exe Uninstall.exe && .\7z a -m0=LZMA -mx=9 install.7z Lang/ 7z.dll 7z.exe 7z.sfx 7zCon.sfx 7zFM.exe 7zG.exe 7-zip.dll Uninstall.exe License.txt && RENAME 7zipInstall.exe 7zipInstall.exe.bak && COPY /Y /B 7zipInstall.exe.bak + install.7z 7zipInstall.exe && DEL Uninstall.exe && DEL install.7z' }}
- name: Upload artifact
uses: actions/upload-artifact@v2
with:
name: Easy_7zip_Artifact
# Exclude directories
path: |
./out/*
!./out/*/*

7
.gitignore vendored Normal file
View File

@@ -0,0 +1,7 @@
*.o
errorfile.txt
*.user
*.obj
out/
*.vcxproj
*.db

100
Asm/arm/7zCrcOpt.asm Executable file
View File

@@ -0,0 +1,100 @@
CODE32
EXPORT |CrcUpdateT4@16|
AREA |.text|, CODE, ARM
MACRO
CRC32_STEP_1
ldrb r4, [r1], #1
subs r2, r2, #1
eor r4, r4, r0
and r4, r4, #0xFF
ldr r4, [r3, +r4, lsl #2]
eor r0, r4, r0, lsr #8
MEND
MACRO
CRC32_STEP_4 $STREAM_WORD
eor r7, r7, r8
eor r7, r7, r9
eor r0, r0, r7
eor r0, r0, $STREAM_WORD
ldr $STREAM_WORD, [r1], #4
and r7, r0, #0xFF
and r8, r0, #0xFF00
and r9, r0, #0xFF0000
and r0, r0, #0xFF000000
ldr r7, [r6, +r7, lsl #2]
ldr r8, [r5, +r8, lsr #6]
ldr r9, [r4, +r9, lsr #14]
ldr r0, [r3, +r0, lsr #22]
MEND
|CrcUpdateT4@16| PROC
stmdb sp!, {r4-r11, lr}
cmp r2, #0
beq |$fin|
|$v1|
tst r1, #7
beq |$v2|
CRC32_STEP_1
bne |$v1|
|$v2|
cmp r2, #16
blo |$v3|
ldr r10, [r1], #4
ldr r11, [r1], #4
add r4, r3, #0x400
add r5, r3, #0x800
add r6, r3, #0xC00
mov r7, #0
mov r8, #0
mov r9, #0
sub r2, r2, #16
|$loop|
; pld [r1, #0x40]
CRC32_STEP_4 r10
CRC32_STEP_4 r11
subs r2, r2, #8
bhs |$loop|
sub r1, r1, #8
add r2, r2, #16
eor r7, r7, r8
eor r7, r7, r9
eor r0, r0, r7
|$v3|
cmp r2, #0
beq |$fin|
|$v4|
CRC32_STEP_1
bne |$v4|
|$fin|
ldmia sp!, {r4-r11, pc}
|CrcUpdateT4@16| ENDP
END

181
Asm/arm64/7zAsm.S Executable file
View File

@@ -0,0 +1,181 @@
// 7zAsm.S -- ASM macros for arm64
// 2021-04-25 : Igor Pavlov : Public domain
#define r0 x0
#define r1 x1
#define r2 x2
#define r3 x3
#define r4 x4
#define r5 x5
#define r6 x6
#define r7 x7
#define r8 x8
#define r9 x9
#define r10 x10
#define r11 x11
#define r12 x12
#define r13 x13
#define r14 x14
#define r15 x15
#define r16 x16
#define r17 x17
#define r18 x18
#define r19 x19
#define r20 x20
#define r21 x21
#define r22 x22
#define r23 x23
#define r24 x24
#define r25 x25
#define r26 x26
#define r27 x27
#define r28 x28
#define r29 x29
#define r30 x30
#define REG_ABI_PARAM_0 r0
#define REG_ABI_PARAM_1 r1
#define REG_ABI_PARAM_2 r2
.macro p2_add reg:req, param:req
add \reg, \reg, \param
.endm
.macro p2_sub reg:req, param:req
sub \reg, \reg, \param
.endm
.macro p2_sub_s reg:req, param:req
subs \reg, \reg, \param
.endm
.macro p2_and reg:req, param:req
and \reg, \reg, \param
.endm
.macro xor reg:req, param:req
eor \reg, \reg, \param
.endm
.macro or reg:req, param:req
orr \reg, \reg, \param
.endm
.macro shl reg:req, param:req
lsl \reg, \reg, \param
.endm
.macro shr reg:req, param:req
lsr \reg, \reg, \param
.endm
.macro sar reg:req, param:req
asr \reg, \reg, \param
.endm
.macro p1_neg reg:req
neg \reg, \reg
.endm
.macro dec reg:req
sub \reg, \reg, 1
.endm
.macro dec_s reg:req
subs \reg, \reg, 1
.endm
.macro inc reg:req
add \reg, \reg, 1
.endm
.macro inc_s reg:req
adds \reg, \reg, 1
.endm
.macro imul reg:req, param:req
mul \reg, \reg, \param
.endm
/*
arm64 and arm use reverted c flag after subs/cmp instructions:
arm64-arm : x86
b.lo / b.cc : jb / jc
b.hs / b.cs : jae / jnc
*/
.macro jmp lab:req
b \lab
.endm
.macro je lab:req
b.eq \lab
.endm
.macro jz lab:req
b.eq \lab
.endm
.macro jnz lab:req
b.ne \lab
.endm
.macro jne lab:req
b.ne \lab
.endm
.macro jb lab:req
b.lo \lab
.endm
.macro jbe lab:req
b.ls \lab
.endm
.macro ja lab:req
b.hi \lab
.endm
.macro jae lab:req
b.hs \lab
.endm
.macro cmove dest:req, srcTrue:req
csel \dest, \srcTrue, \dest, eq
.endm
.macro cmovne dest:req, srcTrue:req
csel \dest, \srcTrue, \dest, ne
.endm
.macro cmovs dest:req, srcTrue:req
csel \dest, \srcTrue, \dest, mi
.endm
.macro cmovns dest:req, srcTrue:req
csel \dest, \srcTrue, \dest, pl
.endm
.macro cmovb dest:req, srcTrue:req
csel \dest, \srcTrue, \dest, lo
.endm
.macro cmovae dest:req, srcTrue:req
csel \dest, \srcTrue, \dest, hs
.endm
.macro MY_ALIGN_16 macro
.p2align 4,, (1 << 4) - 1
.endm
.macro MY_ALIGN_32 macro
.p2align 5,, (1 << 5) - 1
.endm
.macro MY_ALIGN_64 macro
.p2align 6,, (1 << 6) - 1
.endm

1487
Asm/arm64/LzmaDecOpt.S Executable file
View File

File diff suppressed because it is too large Load Diff

View File

@@ -1,101 +0,0 @@
.code
CRC1b macro
movzx EDX, BYTE PTR [RSI]
inc RSI
movzx EBX, AL
xor EDX, EBX
shr EAX, 8
xor EAX, [RDI + RDX * 4]
dec R8
endm
align 16
CrcUpdateT8 PROC
push RBX
push RSI
push RDI
push RBP
mov EAX, ECX
mov RSI, RDX
mov RDI, R9
test R8, R8
jz sl_end
sl:
test RSI, 7
jz sl_end
CRC1b
jnz sl
sl_end:
cmp R8, 16
jb crc_end
mov R9, R8
and R8, 7
add R8, 8
sub R9, R8
add R9, RSI
xor EAX, [RSI]
mov EBX, [RSI + 4]
movzx ECX, BL
align 16
main_loop:
mov EDX, [RDI + RCX*4 + 0C00h]
movzx EBP, BH
xor EDX, [RDI + RBP*4 + 0800h]
shr EBX, 16
movzx ECX, BL
xor EDX, [RSI + 8]
xor EDX, [RDI + RCX*4 + 0400h]
movzx ECX, AL
movzx EBP, BH
xor EDX, [RDI + RBP*4 + 0000h]
mov EBX, [RSI + 12]
xor EDX, [RDI + RCX*4 + 01C00h]
movzx EBP, AH
shr EAX, 16
movzx ECX, AL
xor EDX, [RDI + RBP*4 + 01800h]
movzx EBP, AH
mov EAX, [RDI + RCX*4 + 01400h]
add RSI, 8
xor EAX, [RDI + RBP*4 + 01000h]
movzx ECX, BL
xor EAX,EDX
cmp RSI, R9
jne main_loop
xor EAX, [RSI]
crc_end:
test R8, R8
jz fl_end
fl:
CRC1b
jnz fl
fl_end:
pop RBP
pop RDI
pop RSI
pop RBX
ret
CrcUpdateT8 ENDP
end

289
Asm/x86/7zAsm.asm Executable file
View File

@@ -0,0 +1,289 @@
; 7zAsm.asm -- ASM macros
; 2022-05-16 : Igor Pavlov : Public domain
; UASM can require these changes
; OPTION FRAMEPRESERVEFLAGS:ON
; OPTION PROLOGUE:NONE
; OPTION EPILOGUE:NONE
ifdef @wordsize
; @wordsize is defined only in JWASM and ASMC and is not defined in MASM
; @wordsize eq 8 for 64-bit x64
; @wordsize eq 2 for 32-bit x86
if @wordsize eq 8
x64 equ 1
endif
else
ifdef RAX
x64 equ 1
endif
endif
ifdef x64
IS_X64 equ 1
else
IS_X64 equ 0
endif
ifdef ABI_LINUX
IS_LINUX equ 1
else
IS_LINUX equ 0
endif
ifndef x64
; Use ABI_CDECL for x86 (32-bit) only
; if ABI_CDECL is not defined, we use fastcall abi
ifdef ABI_CDECL
IS_CDECL equ 1
else
IS_CDECL equ 0
endif
endif
OPTION PROLOGUE:NONE
OPTION EPILOGUE:NONE
MY_ASM_START macro
ifdef x64
.code
else
.386
.model flat
_TEXT$00 SEGMENT PARA PUBLIC 'CODE'
endif
endm
MY_PROC macro name:req, numParams:req
align 16
proc_numParams = numParams
if (IS_X64 gt 0)
proc_name equ name
elseif (IS_LINUX gt 0)
proc_name equ name
elseif (IS_CDECL gt 0)
proc_name equ @CatStr(_,name)
else
proc_name equ @CatStr(@,name,@, %numParams * 4)
endif
proc_name PROC
endm
MY_ENDP macro
if (IS_X64 gt 0)
ret
elseif (IS_CDECL gt 0)
ret
elseif (proc_numParams LT 3)
ret
else
ret (proc_numParams - 2) * 4
endif
proc_name ENDP
endm
ifdef x64
REG_SIZE equ 8
REG_LOGAR_SIZE equ 3
else
REG_SIZE equ 4
REG_LOGAR_SIZE equ 2
endif
x0 equ EAX
x1 equ ECX
x2 equ EDX
x3 equ EBX
x4 equ ESP
x5 equ EBP
x6 equ ESI
x7 equ EDI
x0_W equ AX
x1_W equ CX
x2_W equ DX
x3_W equ BX
x5_W equ BP
x6_W equ SI
x7_W equ DI
x0_L equ AL
x1_L equ CL
x2_L equ DL
x3_L equ BL
x0_H equ AH
x1_H equ CH
x2_H equ DH
x3_H equ BH
ifdef x64
x5_L equ BPL
x6_L equ SIL
x7_L equ DIL
r0 equ RAX
r1 equ RCX
r2 equ RDX
r3 equ RBX
r4 equ RSP
r5 equ RBP
r6 equ RSI
r7 equ RDI
x8 equ r8d
x9 equ r9d
x10 equ r10d
x11 equ r11d
x12 equ r12d
x13 equ r13d
x14 equ r14d
x15 equ r15d
else
r0 equ x0
r1 equ x1
r2 equ x2
r3 equ x3
r4 equ x4
r5 equ x5
r6 equ x6
r7 equ x7
endif
ifdef x64
ifdef ABI_LINUX
MY_PUSH_2_REGS macro
push r3
push r5
endm
MY_POP_2_REGS macro
pop r5
pop r3
endm
endif
endif
MY_PUSH_4_REGS macro
push r3
push r5
push r6
push r7
endm
MY_POP_4_REGS macro
pop r7
pop r6
pop r5
pop r3
endm
; for fastcall and for WIN-x64
REG_PARAM_0_x equ x1
REG_PARAM_0 equ r1
REG_PARAM_1_x equ x2
REG_PARAM_1 equ r2
ifndef x64
; for x86-fastcall
REG_ABI_PARAM_0_x equ REG_PARAM_0_x
REG_ABI_PARAM_0 equ REG_PARAM_0
REG_ABI_PARAM_1_x equ REG_PARAM_1_x
REG_ABI_PARAM_1 equ REG_PARAM_1
else
; x64
if (IS_LINUX eq 0)
; for WIN-x64:
REG_PARAM_2_x equ x8
REG_PARAM_2 equ r8
REG_PARAM_3 equ r9
REG_ABI_PARAM_0_x equ REG_PARAM_0_x
REG_ABI_PARAM_0 equ REG_PARAM_0
REG_ABI_PARAM_1_x equ REG_PARAM_1_x
REG_ABI_PARAM_1 equ REG_PARAM_1
REG_ABI_PARAM_2_x equ REG_PARAM_2_x
REG_ABI_PARAM_2 equ REG_PARAM_2
REG_ABI_PARAM_3 equ REG_PARAM_3
else
; for LINUX-x64:
REG_LINUX_PARAM_0_x equ x7
REG_LINUX_PARAM_0 equ r7
REG_LINUX_PARAM_1_x equ x6
REG_LINUX_PARAM_1 equ r6
REG_LINUX_PARAM_2 equ r2
REG_LINUX_PARAM_3 equ r1
REG_LINUX_PARAM_4_x equ x8
REG_LINUX_PARAM_4 equ r8
REG_LINUX_PARAM_5 equ r9
REG_ABI_PARAM_0_x equ REG_LINUX_PARAM_0_x
REG_ABI_PARAM_0 equ REG_LINUX_PARAM_0
REG_ABI_PARAM_1_x equ REG_LINUX_PARAM_1_x
REG_ABI_PARAM_1 equ REG_LINUX_PARAM_1
REG_ABI_PARAM_2 equ REG_LINUX_PARAM_2
REG_ABI_PARAM_3 equ REG_LINUX_PARAM_3
REG_ABI_PARAM_4_x equ REG_LINUX_PARAM_4_x
REG_ABI_PARAM_4 equ REG_LINUX_PARAM_4
REG_ABI_PARAM_5 equ REG_LINUX_PARAM_5
MY_ABI_LINUX_TO_WIN_2 macro
mov r2, r6
mov r1, r7
endm
MY_ABI_LINUX_TO_WIN_3 macro
mov r8, r2
mov r2, r6
mov r1, r7
endm
MY_ABI_LINUX_TO_WIN_4 macro
mov r9, r1
mov r8, r2
mov r2, r6
mov r1, r7
endm
endif ; IS_LINUX
MY_PUSH_PRESERVED_ABI_REGS macro
if (IS_LINUX gt 0)
MY_PUSH_2_REGS
else
MY_PUSH_4_REGS
endif
push r12
push r13
push r14
push r15
endm
MY_POP_PRESERVED_ABI_REGS macro
pop r15
pop r14
pop r13
pop r12
if (IS_LINUX gt 0)
MY_POP_2_REGS
else
MY_POP_4_REGS
endif
endm
endif ; x64

180
Asm/x86/7zCrcOpt.asm Executable file
View File

@@ -0,0 +1,180 @@
; 7zCrcOpt.asm -- CRC32 calculation : optimized version
; 2021-02-07 : Igor Pavlov : Public domain
include 7zAsm.asm
MY_ASM_START
rD equ r2
rN equ r7
rT equ r5
ifdef x64
num_VAR equ r8
table_VAR equ r9
else
if (IS_CDECL gt 0)
crc_OFFS equ (REG_SIZE * 5)
data_OFFS equ (REG_SIZE + crc_OFFS)
size_OFFS equ (REG_SIZE + data_OFFS)
else
size_OFFS equ (REG_SIZE * 5)
endif
table_OFFS equ (REG_SIZE + size_OFFS)
num_VAR equ [r4 + size_OFFS]
table_VAR equ [r4 + table_OFFS]
endif
SRCDAT equ rD + rN * 1 + 4 *
CRC macro op:req, dest:req, src:req, t:req
op dest, DWORD PTR [rT + src * 4 + 0400h * t]
endm
CRC_XOR macro dest:req, src:req, t:req
CRC xor, dest, src, t
endm
CRC_MOV macro dest:req, src:req, t:req
CRC mov, dest, src, t
endm
CRC1b macro
movzx x6, BYTE PTR [rD]
inc rD
movzx x3, x0_L
xor x6, x3
shr x0, 8
CRC xor, x0, r6, 0
dec rN
endm
MY_PROLOG macro crc_end:req
ifdef x64
if (IS_LINUX gt 0)
MY_PUSH_2_REGS
mov x0, REG_ABI_PARAM_0_x ; x0 = x7
mov rT, REG_ABI_PARAM_3 ; r5 = r1
mov rN, REG_ABI_PARAM_2 ; r7 = r2
mov rD, REG_ABI_PARAM_1 ; r2 = r6
else
MY_PUSH_4_REGS
mov x0, REG_ABI_PARAM_0_x ; x0 = x1
mov rT, REG_ABI_PARAM_3 ; r5 = r9
mov rN, REG_ABI_PARAM_2 ; r7 = r8
; mov rD, REG_ABI_PARAM_1 ; r2 = r2
endif
else
MY_PUSH_4_REGS
if (IS_CDECL gt 0)
mov x0, [r4 + crc_OFFS]
mov rD, [r4 + data_OFFS]
else
mov x0, REG_ABI_PARAM_0_x
endif
mov rN, num_VAR
mov rT, table_VAR
endif
test rN, rN
jz crc_end
@@:
test rD, 7
jz @F
CRC1b
jnz @B
@@:
cmp rN, 16
jb crc_end
add rN, rD
mov num_VAR, rN
sub rN, 8
and rN, NOT 7
sub rD, rN
xor x0, [SRCDAT 0]
endm
MY_EPILOG macro crc_end:req
xor x0, [SRCDAT 0]
mov rD, rN
mov rN, num_VAR
sub rN, rD
crc_end:
test rN, rN
jz @F
CRC1b
jmp crc_end
@@:
if (IS_X64 gt 0) and (IS_LINUX gt 0)
MY_POP_2_REGS
else
MY_POP_4_REGS
endif
endm
MY_PROC CrcUpdateT8, 4
MY_PROLOG crc_end_8
mov x1, [SRCDAT 1]
align 16
main_loop_8:
mov x6, [SRCDAT 2]
movzx x3, x1_L
CRC_XOR x6, r3, 3
movzx x3, x1_H
CRC_XOR x6, r3, 2
shr x1, 16
movzx x3, x1_L
movzx x1, x1_H
CRC_XOR x6, r3, 1
movzx x3, x0_L
CRC_XOR x6, r1, 0
mov x1, [SRCDAT 3]
CRC_XOR x6, r3, 7
movzx x3, x0_H
shr x0, 16
CRC_XOR x6, r3, 6
movzx x3, x0_L
CRC_XOR x6, r3, 5
movzx x3, x0_H
CRC_MOV x0, r3, 4
xor x0, x6
add rD, 8
jnz main_loop_8
MY_EPILOG crc_end_8
MY_ENDP
MY_PROC CrcUpdateT4, 4
MY_PROLOG crc_end_4
align 16
main_loop_4:
movzx x1, x0_L
movzx x3, x0_H
shr x0, 16
movzx x6, x0_H
and x0, 0FFh
CRC_MOV x1, r1, 3
xor x1, [SRCDAT 1]
CRC_XOR x1, r3, 2
CRC_XOR x1, r6, 0
CRC_XOR x1, r0, 1
movzx x0, x1_L
movzx x3, x1_H
shr x1, 16
movzx x6, x1_H
and x1, 0FFh
CRC_MOV x0, r0, 3
xor x0, [SRCDAT 2]
CRC_XOR x0, r3, 2
CRC_XOR x0, r6, 0
CRC_XOR x0, r1, 1
add rD, 8
jnz main_loop_4
MY_EPILOG crc_end_4
MY_ENDP
end

View File

@@ -1,101 +0,0 @@
.386
.model flat
_TEXT$00 SEGMENT PARA PUBLIC 'CODE'
CRC1b macro
movzx EDX, BYTE PTR [ESI]
inc ESI
movzx EBX, AL
xor EDX, EBX
shr EAX, 8
xor EAX, [EBP + EDX * 4]
dec EDI
endm
data_size equ (4 + 4*4)
crc_table equ (data_size + 4)
align 16
public @CrcUpdateT8@16
@CrcUpdateT8@16:
push EBX
push ESI
push EDI
push EBP
mov EAX, ECX
mov ESI, EDX
mov EDI, [ESP + data_size]
mov EBP, [ESP + crc_table]
test EDI, EDI
jz sl_end
sl:
test ESI, 7
jz sl_end
CRC1b
jnz sl
sl_end:
cmp EDI, 16
jb crc_end
mov [ESP + data_size], EDI
sub EDI, 8
and EDI, NOT 7
sub [ESP + data_size], EDI
add EDI, ESI
xor EAX, [ESI]
mov EBX, [ESI + 4]
movzx ECX, BL
align 16
main_loop:
mov EDX, [EBP + ECX*4 + 0C00h]
movzx ECX, BH
xor EDX, [EBP + ECX*4 + 0800h]
shr EBX, 16
movzx ECX, BL
xor EDX, [EBP + ECX*4 + 0400h]
xor EDX, [ESI + 8]
movzx ECX, AL
movzx EBX, BH
xor EDX, [EBP + EBX*4 + 0000h]
mov EBX, [ESI + 12]
xor EDX, [EBP + ECX*4 + 01C00h]
movzx ECX, AH
add ESI, 8
shr EAX, 16
xor EDX, [EBP + ECX*4 + 01800h]
movzx ECX, AL
xor EDX, [EBP + ECX*4 + 01400h]
movzx ECX, AH
mov EAX, [EBP + ECX*4 + 01000h]
movzx ECX, BL
xor EAX,EDX
cmp ESI, EDI
jne main_loop
xor EAX, [ESI]
mov EDI, [ESP + data_size]
crc_end:
test EDI, EDI
jz fl_end
fl:
CRC1b
jnz fl
fl_end:
pop EBP
pop EDI
pop ESI
pop EBX
ret 8
end

742
Asm/x86/AesOpt.asm Executable file
View File

@@ -0,0 +1,742 @@
; AesOpt.asm -- AES optimized code for x86 AES hardware instructions
; 2021-12-25 : Igor Pavlov : Public domain
include 7zAsm.asm
ifdef __ASMC__
use_vaes_256 equ 1
else
ifdef ymm0
use_vaes_256 equ 1
endif
endif
ifdef use_vaes_256
ECHO "++ VAES 256"
else
ECHO "-- NO VAES 256"
endif
ifdef x64
ECHO "x86-64"
else
ECHO "x86"
if (IS_CDECL gt 0)
ECHO "ABI : CDECL"
else
ECHO "ABI : no CDECL : FASTCALL"
endif
endif
if (IS_LINUX gt 0)
ECHO "ABI : LINUX"
else
ECHO "ABI : WINDOWS"
endif
MY_ASM_START
ifndef x64
.686
.xmm
endif
; MY_ALIGN EQU ALIGN(64)
MY_ALIGN EQU
SEG_ALIGN EQU MY_ALIGN
MY_SEG_PROC macro name:req, numParams:req
; seg_name equ @CatStr(_TEXT$, name)
; seg_name SEGMENT SEG_ALIGN 'CODE'
MY_PROC name, numParams
endm
MY_SEG_ENDP macro
; seg_name ENDS
endm
NUM_AES_KEYS_MAX equ 15
; the number of push operators in function PROLOG
if (IS_LINUX eq 0) or (IS_X64 eq 0)
num_regs_push equ 2
stack_param_offset equ (REG_SIZE * (1 + num_regs_push))
endif
ifdef x64
num_param equ REG_ABI_PARAM_2
else
if (IS_CDECL gt 0)
; size_t size
; void * data
; UInt32 * aes
; ret-ip <- (r4)
aes_OFFS equ (stack_param_offset)
data_OFFS equ (REG_SIZE + aes_OFFS)
size_OFFS equ (REG_SIZE + data_OFFS)
num_param equ [r4 + size_OFFS]
else
num_param equ [r4 + stack_param_offset]
endif
endif
keys equ REG_PARAM_0 ; r1
rD equ REG_PARAM_1 ; r2
rN equ r0
koffs_x equ x7
koffs_r equ r7
ksize_x equ x6
ksize_r equ r6
keys2 equ r3
state equ xmm0
key equ xmm0
key_ymm equ ymm0
key_ymm_n equ 0
ifdef x64
ways = 11
else
ways = 4
endif
ways_start_reg equ 1
iv equ @CatStr(xmm, %(ways_start_reg + ways))
iv_ymm equ @CatStr(ymm, %(ways_start_reg + ways))
WOP macro op, op2
i = 0
rept ways
op @CatStr(xmm, %(ways_start_reg + i)), op2
i = i + 1
endm
endm
ifndef ABI_LINUX
ifdef x64
; we use 32 bytes of home space in stack in WIN64-x64
NUM_HOME_MM_REGS equ (32 / 16)
; we preserve xmm registers starting from xmm6 in WIN64-x64
MM_START_SAVE_REG equ 6
SAVE_XMM macro num_used_mm_regs:req
num_save_mm_regs = num_used_mm_regs - MM_START_SAVE_REG
if num_save_mm_regs GT 0
num_save_mm_regs2 = num_save_mm_regs - NUM_HOME_MM_REGS
; RSP is (16*x + 8) after entering the function in WIN64-x64
stack_offset = 16 * num_save_mm_regs2 + (stack_param_offset mod 16)
i = 0
rept num_save_mm_regs
if i eq NUM_HOME_MM_REGS
sub r4, stack_offset
endif
if i lt NUM_HOME_MM_REGS
movdqa [r4 + stack_param_offset + i * 16], @CatStr(xmm, %(MM_START_SAVE_REG + i))
else
movdqa [r4 + (i - NUM_HOME_MM_REGS) * 16], @CatStr(xmm, %(MM_START_SAVE_REG + i))
endif
i = i + 1
endm
endif
endm
RESTORE_XMM macro num_used_mm_regs:req
if num_save_mm_regs GT 0
i = 0
if num_save_mm_regs2 GT 0
rept num_save_mm_regs2
movdqa @CatStr(xmm, %(MM_START_SAVE_REG + NUM_HOME_MM_REGS + i)), [r4 + i * 16]
i = i + 1
endm
add r4, stack_offset
endif
num_low_regs = num_save_mm_regs - i
i = 0
rept num_low_regs
movdqa @CatStr(xmm, %(MM_START_SAVE_REG + i)), [r4 + stack_param_offset + i * 16]
i = i + 1
endm
endif
endm
endif ; x64
endif ; ABI_LINUX
MY_PROLOG macro num_used_mm_regs:req
; num_regs_push: must be equal to the number of push operators
; push r3
; push r5
if (IS_LINUX eq 0) or (IS_X64 eq 0)
push r6
push r7
endif
mov rN, num_param ; don't move it; num_param can use stack pointer (r4)
if (IS_X64 eq 0)
if (IS_CDECL gt 0)
mov rD, [r4 + data_OFFS]
mov keys, [r4 + aes_OFFS]
endif
elseif (IS_LINUX gt 0)
MY_ABI_LINUX_TO_WIN_2
endif
ifndef ABI_LINUX
ifdef x64
SAVE_XMM num_used_mm_regs
endif
endif
mov ksize_x, [keys + 16]
shl ksize_x, 5
endm
MY_EPILOG macro
ifndef ABI_LINUX
ifdef x64
RESTORE_XMM num_save_mm_regs
endif
endif
if (IS_LINUX eq 0) or (IS_X64 eq 0)
pop r7
pop r6
endif
; pop r5
; pop r3
MY_ENDP
endm
OP_KEY macro op:req, offs:req
op state, [keys + offs]
endm
WOP_KEY macro op:req, offs:req
movdqa key, [keys + offs]
WOP op, key
endm
; ---------- AES-CBC Decode ----------
XOR_WITH_DATA macro reg, _ppp_
pxor reg, [rD + i * 16]
endm
WRITE_TO_DATA macro reg, _ppp_
movdqa [rD + i * 16], reg
endm
; state0 equ @CatStr(xmm, %(ways_start_reg))
key0 equ @CatStr(xmm, %(ways_start_reg + ways + 1))
key0_ymm equ @CatStr(ymm, %(ways_start_reg + ways + 1))
key_last equ @CatStr(xmm, %(ways_start_reg + ways + 2))
key_last_ymm equ @CatStr(ymm, %(ways_start_reg + ways + 2))
key_last_ymm_n equ (ways_start_reg + ways + 2)
NUM_CBC_REGS equ (ways_start_reg + ways + 3)
MY_SEG_PROC AesCbc_Decode_HW, 3
AesCbc_Decode_HW_start::
MY_PROLOG NUM_CBC_REGS
AesCbc_Decode_HW_start_2::
movdqa iv, [keys]
add keys, 32
movdqa key0, [keys + 1 * ksize_r]
movdqa key_last, [keys]
sub ksize_x, 16
jmp check2
align 16
nextBlocks2:
WOP movdqa, [rD + i * 16]
mov koffs_x, ksize_x
; WOP_KEY pxor, ksize_r + 16
WOP pxor, key0
; align 16
@@:
WOP_KEY aesdec, 1 * koffs_r
sub koffs_r, 16
jnz @B
; WOP_KEY aesdeclast, 0
WOP aesdeclast, key_last
pxor @CatStr(xmm, %(ways_start_reg)), iv
i = 1
rept ways - 1
pxor @CatStr(xmm, %(ways_start_reg + i)), [rD + i * 16 - 16]
i = i + 1
endm
movdqa iv, [rD + ways * 16 - 16]
WOP WRITE_TO_DATA
add rD, ways * 16
AesCbc_Decode_HW_start_3::
check2:
sub rN, ways
jnc nextBlocks2
add rN, ways
sub ksize_x, 16
jmp check
nextBlock:
movdqa state, [rD]
mov koffs_x, ksize_x
; OP_KEY pxor, 1 * ksize_r + 32
pxor state, key0
; movdqa state0, [rD]
; movdqa state, key0
; pxor state, state0
@@:
OP_KEY aesdec, 1 * koffs_r + 16
OP_KEY aesdec, 1 * koffs_r
sub koffs_r, 32
jnz @B
OP_KEY aesdec, 16
; OP_KEY aesdeclast, 0
aesdeclast state, key_last
pxor state, iv
movdqa iv, [rD]
; movdqa iv, state0
movdqa [rD], state
add rD, 16
check:
sub rN, 1
jnc nextBlock
movdqa [keys - 32], iv
MY_EPILOG
; ---------- AVX ----------
AVX__WOP_n macro op
i = 0
rept ways
op (ways_start_reg + i)
i = i + 1
endm
endm
AVX__WOP macro op
i = 0
rept ways
op @CatStr(ymm, %(ways_start_reg + i))
i = i + 1
endm
endm
AVX__WOP_KEY macro op:req, offs:req
vmovdqa key_ymm, ymmword ptr [keys2 + offs]
AVX__WOP_n op
endm
AVX__CBC_START macro reg
; vpxor reg, key_ymm, ymmword ptr [rD + 32 * i]
vpxor reg, key0_ymm, ymmword ptr [rD + 32 * i]
endm
AVX__CBC_END macro reg
if i eq 0
vpxor reg, reg, iv_ymm
else
vpxor reg, reg, ymmword ptr [rD + i * 32 - 16]
endif
endm
AVX__WRITE_TO_DATA macro reg
vmovdqu ymmword ptr [rD + 32 * i], reg
endm
AVX__XOR_WITH_DATA macro reg
vpxor reg, reg, ymmword ptr [rD + 32 * i]
endm
AVX__CTR_START macro reg
vpaddq iv_ymm, iv_ymm, one_ymm
; vpxor reg, iv_ymm, key_ymm
vpxor reg, iv_ymm, key0_ymm
endm
MY_VAES_INSTR_2 macro cmd, dest, a1, a2
db 0c4H
db 2 + 040H + 020h * (1 - (a2) / 8) + 080h * (1 - (dest) / 8)
db 5 + 8 * ((not (a1)) and 15)
db cmd
db 0c0H + 8 * ((dest) and 7) + ((a2) and 7)
endm
MY_VAES_INSTR macro cmd, dest, a
MY_VAES_INSTR_2 cmd, dest, dest, a
endm
MY_vaesenc macro dest, a
MY_VAES_INSTR 0dcH, dest, a
endm
MY_vaesenclast macro dest, a
MY_VAES_INSTR 0ddH, dest, a
endm
MY_vaesdec macro dest, a
MY_VAES_INSTR 0deH, dest, a
endm
MY_vaesdeclast macro dest, a
MY_VAES_INSTR 0dfH, dest, a
endm
AVX__VAES_DEC macro reg
MY_vaesdec reg, key_ymm_n
endm
AVX__VAES_DEC_LAST_key_last macro reg
; MY_vaesdeclast reg, key_ymm_n
MY_vaesdeclast reg, key_last_ymm_n
endm
AVX__VAES_ENC macro reg
MY_vaesenc reg, key_ymm_n
endm
AVX__VAES_ENC_LAST macro reg
MY_vaesenclast reg, key_ymm_n
endm
AVX__vinserti128_TO_HIGH macro dest, src
vinserti128 dest, dest, src, 1
endm
MY_PROC AesCbc_Decode_HW_256, 3
ifdef use_vaes_256
MY_PROLOG NUM_CBC_REGS
cmp rN, ways * 2
jb AesCbc_Decode_HW_start_2
vmovdqa iv, xmmword ptr [keys]
add keys, 32
vbroadcasti128 key0_ymm, xmmword ptr [keys + 1 * ksize_r]
vbroadcasti128 key_last_ymm, xmmword ptr [keys]
sub ksize_x, 16
mov koffs_x, ksize_x
add ksize_x, ksize_x
AVX_STACK_SUB = ((NUM_AES_KEYS_MAX + 1 - 2) * 32)
push keys2
sub r4, AVX_STACK_SUB
; sub r4, 32
; sub r4, ksize_r
; lea keys2, [r4 + 32]
mov keys2, r4
and keys2, -32
broad:
vbroadcasti128 key_ymm, xmmword ptr [keys + 1 * koffs_r]
vmovdqa ymmword ptr [keys2 + koffs_r * 2], key_ymm
sub koffs_r, 16
; jnc broad
jnz broad
sub rN, ways * 2
align 16
avx_cbcdec_nextBlock2:
mov koffs_x, ksize_x
; AVX__WOP_KEY AVX__CBC_START, 1 * koffs_r + 32
AVX__WOP AVX__CBC_START
@@:
AVX__WOP_KEY AVX__VAES_DEC, 1 * koffs_r
sub koffs_r, 32
jnz @B
; AVX__WOP_KEY AVX__VAES_DEC_LAST, 0
AVX__WOP_n AVX__VAES_DEC_LAST_key_last
AVX__vinserti128_TO_HIGH iv_ymm, xmmword ptr [rD]
AVX__WOP AVX__CBC_END
vmovdqa iv, xmmword ptr [rD + ways * 32 - 16]
AVX__WOP AVX__WRITE_TO_DATA
add rD, ways * 32
sub rN, ways * 2
jnc avx_cbcdec_nextBlock2
add rN, ways * 2
shr ksize_x, 1
; lea r4, [r4 + 1 * ksize_r + 32]
add r4, AVX_STACK_SUB
pop keys2
vzeroupper
jmp AesCbc_Decode_HW_start_3
else
jmp AesCbc_Decode_HW_start
endif
MY_ENDP
MY_SEG_ENDP
; ---------- AES-CBC Encode ----------
e0 equ xmm1
CENC_START_KEY equ 2
CENC_NUM_REG_KEYS equ (3 * 2)
; last_key equ @CatStr(xmm, %(CENC_START_KEY + CENC_NUM_REG_KEYS))
MY_SEG_PROC AesCbc_Encode_HW, 3
MY_PROLOG (CENC_START_KEY + CENC_NUM_REG_KEYS + 0)
movdqa state, [keys]
add keys, 32
i = 0
rept CENC_NUM_REG_KEYS
movdqa @CatStr(xmm, %(CENC_START_KEY + i)), [keys + i * 16]
i = i + 1
endm
add keys, ksize_r
neg ksize_r
add ksize_r, (16 * CENC_NUM_REG_KEYS)
; movdqa last_key, [keys]
jmp check_e
align 16
nextBlock_e:
movdqa e0, [rD]
mov koffs_r, ksize_r
pxor e0, @CatStr(xmm, %(CENC_START_KEY))
pxor state, e0
i = 1
rept (CENC_NUM_REG_KEYS - 1)
aesenc state, @CatStr(xmm, %(CENC_START_KEY + i))
i = i + 1
endm
@@:
OP_KEY aesenc, 1 * koffs_r
OP_KEY aesenc, 1 * koffs_r + 16
add koffs_r, 32
jnz @B
OP_KEY aesenclast, 0
; aesenclast state, last_key
movdqa [rD], state
add rD, 16
check_e:
sub rN, 1
jnc nextBlock_e
; movdqa [keys - 32], state
movdqa [keys + 1 * ksize_r - (16 * CENC_NUM_REG_KEYS) - 32], state
MY_EPILOG
MY_SEG_ENDP
; ---------- AES-CTR ----------
ifdef x64
; ways = 11
endif
one equ @CatStr(xmm, %(ways_start_reg + ways + 1))
one_ymm equ @CatStr(ymm, %(ways_start_reg + ways + 1))
key0 equ @CatStr(xmm, %(ways_start_reg + ways + 2))
key0_ymm equ @CatStr(ymm, %(ways_start_reg + ways + 2))
NUM_CTR_REGS equ (ways_start_reg + ways + 3)
INIT_CTR macro reg, _ppp_
paddq iv, one
movdqa reg, iv
endm
MY_SEG_PROC AesCtr_Code_HW, 3
Ctr_start::
MY_PROLOG NUM_CTR_REGS
Ctr_start_2::
movdqa iv, [keys]
add keys, 32
movdqa key0, [keys]
add keys, ksize_r
neg ksize_r
add ksize_r, 16
Ctr_start_3::
mov koffs_x, 1
movd one, koffs_x
jmp check2_c
align 16
nextBlocks2_c:
WOP INIT_CTR, 0
mov koffs_r, ksize_r
; WOP_KEY pxor, 1 * koffs_r -16
WOP pxor, key0
@@:
WOP_KEY aesenc, 1 * koffs_r
add koffs_r, 16
jnz @B
WOP_KEY aesenclast, 0
WOP XOR_WITH_DATA
WOP WRITE_TO_DATA
add rD, ways * 16
check2_c:
sub rN, ways
jnc nextBlocks2_c
add rN, ways
sub keys, 16
add ksize_r, 16
jmp check_c
; align 16
nextBlock_c:
paddq iv, one
; movdqa state, [keys + 1 * koffs_r - 16]
movdqa state, key0
mov koffs_r, ksize_r
pxor state, iv
@@:
OP_KEY aesenc, 1 * koffs_r
OP_KEY aesenc, 1 * koffs_r + 16
add koffs_r, 32
jnz @B
OP_KEY aesenc, 0
OP_KEY aesenclast, 16
pxor state, [rD]
movdqa [rD], state
add rD, 16
check_c:
sub rN, 1
jnc nextBlock_c
; movdqa [keys - 32], iv
movdqa [keys + 1 * ksize_r - 16 - 32], iv
MY_EPILOG
MY_PROC AesCtr_Code_HW_256, 3
ifdef use_vaes_256
MY_PROLOG NUM_CTR_REGS
cmp rN, ways * 2
jb Ctr_start_2
vbroadcasti128 iv_ymm, xmmword ptr [keys]
add keys, 32
vbroadcasti128 key0_ymm, xmmword ptr [keys]
mov koffs_x, 1
vmovd one, koffs_x
vpsubq iv_ymm, iv_ymm, one_ymm
vpaddq one, one, one
AVX__vinserti128_TO_HIGH one_ymm, one
add keys, ksize_r
sub ksize_x, 16
neg ksize_r
mov koffs_r, ksize_r
add ksize_r, ksize_r
AVX_STACK_SUB = ((NUM_AES_KEYS_MAX + 1 - 1) * 32)
push keys2
lea keys2, [r4 - 32]
sub r4, AVX_STACK_SUB
and keys2, -32
vbroadcasti128 key_ymm, xmmword ptr [keys]
vmovdqa ymmword ptr [keys2], key_ymm
@@:
vbroadcasti128 key_ymm, xmmword ptr [keys + 1 * koffs_r]
vmovdqa ymmword ptr [keys2 + koffs_r * 2], key_ymm
add koffs_r, 16
jnz @B
sub rN, ways * 2
align 16
avx_ctr_nextBlock2:
mov koffs_r, ksize_r
AVX__WOP AVX__CTR_START
; AVX__WOP_KEY AVX__CTR_START, 1 * koffs_r - 32
@@:
AVX__WOP_KEY AVX__VAES_ENC, 1 * koffs_r
add koffs_r, 32
jnz @B
AVX__WOP_KEY AVX__VAES_ENC_LAST, 0
AVX__WOP AVX__XOR_WITH_DATA
AVX__WOP AVX__WRITE_TO_DATA
add rD, ways * 32
sub rN, ways * 2
jnc avx_ctr_nextBlock2
add rN, ways * 2
vextracti128 iv, iv_ymm, 1
sar ksize_r, 1
add r4, AVX_STACK_SUB
pop keys2
vzeroupper
jmp Ctr_start_3
else
jmp Ctr_start
endif
MY_ENDP
MY_SEG_ENDP
end

513
Asm/x86/LzFindOpt.asm Executable file
View File

@@ -0,0 +1,513 @@
; LzFindOpt.asm -- ASM version of GetMatchesSpecN_2() function
; 2021-07-21: Igor Pavlov : Public domain
;
ifndef x64
; x64=1
; .err <x64_IS_REQUIRED>
endif
include 7zAsm.asm
MY_ASM_START
_TEXT$LZFINDOPT SEGMENT ALIGN(64) 'CODE'
MY_ALIGN macro num:req
align num
endm
MY_ALIGN_32 macro
MY_ALIGN 32
endm
MY_ALIGN_64 macro
MY_ALIGN 64
endm
t0_L equ x0_L
t0_x equ x0
t0 equ r0
t1_x equ x3
t1 equ r3
cp_x equ t1_x
cp_r equ t1
m equ x5
m_r equ r5
len_x equ x6
len equ r6
diff_x equ x7
diff equ r7
len0 equ r10
len1_x equ x11
len1 equ r11
maxLen_x equ x12
maxLen equ r12
d equ r13
ptr0 equ r14
ptr1 equ r15
d_lim equ m_r
cycSize equ len_x
hash_lim equ len0
delta1_x equ len1_x
delta1_r equ len1
delta_x equ maxLen_x
delta_r equ maxLen
hash equ ptr0
src equ ptr1
if (IS_LINUX gt 0)
; r1 r2 r8 r9 : win32
; r7 r6 r2 r1 r8 r9 : linux
lenLimit equ r8
lenLimit_x equ x8
; pos_r equ r2
pos equ x2
cur equ r1
son equ r9
else
lenLimit equ REG_ABI_PARAM_2
lenLimit_x equ REG_ABI_PARAM_2_x
pos equ REG_ABI_PARAM_1_x
cur equ REG_ABI_PARAM_0
son equ REG_ABI_PARAM_3
endif
if (IS_LINUX gt 0)
maxLen_OFFS equ (REG_SIZE * (6 + 1))
else
cutValue_OFFS equ (REG_SIZE * (8 + 1 + 4))
d_OFFS equ (REG_SIZE + cutValue_OFFS)
maxLen_OFFS equ (REG_SIZE + d_OFFS)
endif
hash_OFFS equ (REG_SIZE + maxLen_OFFS)
limit_OFFS equ (REG_SIZE + hash_OFFS)
size_OFFS equ (REG_SIZE + limit_OFFS)
cycPos_OFFS equ (REG_SIZE + size_OFFS)
cycSize_OFFS equ (REG_SIZE + cycPos_OFFS)
posRes_OFFS equ (REG_SIZE + cycSize_OFFS)
if (IS_LINUX gt 0)
else
cutValue_PAR equ [r0 + cutValue_OFFS]
d_PAR equ [r0 + d_OFFS]
endif
maxLen_PAR equ [r0 + maxLen_OFFS]
hash_PAR equ [r0 + hash_OFFS]
limit_PAR equ [r0 + limit_OFFS]
size_PAR equ [r0 + size_OFFS]
cycPos_PAR equ [r0 + cycPos_OFFS]
cycSize_PAR equ [r0 + cycSize_OFFS]
posRes_PAR equ [r0 + posRes_OFFS]
cutValue_VAR equ DWORD PTR [r4 + 8 * 0]
cutValueCur_VAR equ DWORD PTR [r4 + 8 * 0 + 4]
cycPos_VAR equ DWORD PTR [r4 + 8 * 1 + 0]
cycSize_VAR equ DWORD PTR [r4 + 8 * 1 + 4]
hash_VAR equ QWORD PTR [r4 + 8 * 2]
limit_VAR equ QWORD PTR [r4 + 8 * 3]
size_VAR equ QWORD PTR [r4 + 8 * 4]
distances equ QWORD PTR [r4 + 8 * 5]
maxLen_VAR equ QWORD PTR [r4 + 8 * 6]
Old_RSP equ QWORD PTR [r4 + 8 * 7]
LOCAL_SIZE equ 8 * 8
COPY_VAR_32 macro dest_var, src_var
mov x3, src_var
mov dest_var, x3
endm
COPY_VAR_64 macro dest_var, src_var
mov r3, src_var
mov dest_var, r3
endm
; MY_ALIGN_64
MY_PROC GetMatchesSpecN_2, 13
MY_PUSH_PRESERVED_ABI_REGS
mov r0, RSP
lea r3, [r0 - LOCAL_SIZE]
and r3, -64
mov RSP, r3
mov Old_RSP, r0
if (IS_LINUX gt 0)
mov d, REG_ABI_PARAM_5 ; r13 = r9
mov cutValue_VAR, REG_ABI_PARAM_4_x ; = r8
mov son, REG_ABI_PARAM_3 ; r9 = r1
mov r8, REG_ABI_PARAM_2 ; r8 = r2
mov pos, REG_ABI_PARAM_1_x ; r2 = x6
mov r1, REG_ABI_PARAM_0 ; r1 = r7
else
COPY_VAR_32 cutValue_VAR, cutValue_PAR
mov d, d_PAR
endif
COPY_VAR_64 limit_VAR, limit_PAR
mov hash_lim, size_PAR
mov size_VAR, hash_lim
mov cp_x, cycPos_PAR
mov hash, hash_PAR
mov cycSize, cycSize_PAR
mov cycSize_VAR, cycSize
; we want cur in (rcx). So we change the cur and lenLimit variables
sub lenLimit, cur
neg lenLimit_x
inc lenLimit_x
mov t0_x, maxLen_PAR
sub t0, lenLimit
mov maxLen_VAR, t0
jmp main_loop
MY_ALIGN_64
fill_empty:
; ptr0 = *ptr1 = kEmptyHashValue;
mov QWORD PTR [ptr1], 0
inc pos
inc cp_x
mov DWORD PTR [d - 4], 0
cmp d, limit_VAR
jae fin
cmp hash, hash_lim
je fin
; MY_ALIGN_64
main_loop:
; UInt32 delta = *hash++;
mov diff_x, [hash] ; delta
add hash, 4
; mov cycPos_VAR, cp_x
inc cur
add d, 4
mov m, pos
sub m, diff_x; ; matchPos
; CLzRef *ptr1 = son + ((size_t)(pos) << 1) - CYC_TO_POS_OFFSET * 2;
lea ptr1, [son + 8 * cp_r]
; mov cycSize, cycSize_VAR
cmp pos, cycSize
jb directMode ; if (pos < cycSize_VAR)
; CYC MODE
cmp diff_x, cycSize
jae fill_empty ; if (delta >= cycSize_VAR)
xor t0_x, t0_x
mov cycPos_VAR, cp_x
sub cp_x, diff_x
; jae prepare_for_tree_loop
; add cp_x, cycSize
cmovb t0_x, cycSize
add cp_x, t0_x ; cp_x += (cycPos < delta ? cycSize : 0)
jmp prepare_for_tree_loop
directMode:
cmp diff_x, pos
je fill_empty ; if (delta == pos)
jae fin_error ; if (delta >= pos)
mov cycPos_VAR, cp_x
mov cp_x, m
prepare_for_tree_loop:
mov len0, lenLimit
mov hash_VAR, hash
; CLzRef *ptr0 = son + ((size_t)(pos) << 1) - CYC_TO_POS_OFFSET * 2 + 1;
lea ptr0, [ptr1 + 4]
; UInt32 *_distances = ++d;
mov distances, d
neg len0
mov len1, len0
mov t0_x, cutValue_VAR
mov maxLen, maxLen_VAR
mov cutValueCur_VAR, t0_x
MY_ALIGN_32
tree_loop:
neg diff
mov len, len0
cmp len1, len0
cmovb len, len1 ; len = (len1 < len0 ? len1 : len0);
add diff, cur
mov t0_x, [son + cp_r * 8] ; prefetch
movzx t0_x, BYTE PTR [diff + 1 * len]
lea cp_r, [son + cp_r * 8]
cmp [cur + 1 * len], t0_L
je matched_1
jb left_0
mov [ptr1], m
mov m, [cp_r + 4]
lea ptr1, [cp_r + 4]
sub diff, cur ; FIX32
jmp next_node
MY_ALIGN_32
left_0:
mov [ptr0], m
mov m, [cp_r]
mov ptr0, cp_r
sub diff, cur ; FIX32
; jmp next_node
; ------------ NEXT NODE ------------
; MY_ALIGN_32
next_node:
mov cycSize, cycSize_VAR
dec cutValueCur_VAR
je finish_tree
add diff_x, pos ; prev_match = pos + diff
cmp m, diff_x
jae fin_error ; if (new_match >= prev_match)
mov diff_x, pos
sub diff_x, m ; delta = pos - new_match
cmp pos, cycSize
jae cyc_mode_2 ; if (pos >= cycSize)
mov cp_x, m
test m, m
jne tree_loop ; if (m != 0)
finish_tree:
; ptr0 = *ptr1 = kEmptyHashValue;
mov DWORD PTR [ptr0], 0
mov DWORD PTR [ptr1], 0
inc pos
; _distances[-1] = (UInt32)(d - _distances);
mov t0, distances
mov t1, d
sub t1, t0
shr t1_x, 2
mov [t0 - 4], t1_x
cmp d, limit_VAR
jae fin ; if (d >= limit)
mov cp_x, cycPos_VAR
mov hash, hash_VAR
mov hash_lim, size_VAR
inc cp_x
cmp hash, hash_lim
jne main_loop ; if (hash != size)
jmp fin
MY_ALIGN_32
cyc_mode_2:
cmp diff_x, cycSize
jae finish_tree ; if (delta >= cycSize)
mov cp_x, cycPos_VAR
xor t0_x, t0_x
sub cp_x, diff_x ; cp_x = cycPos - delta
cmovb t0_x, cycSize
add cp_x, t0_x ; cp_x += (cycPos < delta ? cycSize : 0)
jmp tree_loop
MY_ALIGN_32
matched_1:
inc len
; cmp len_x, lenLimit_x
je short lenLimit_reach
movzx t0_x, BYTE PTR [diff + 1 * len]
cmp [cur + 1 * len], t0_L
jne mismatch
MY_ALIGN_32
match_loop:
; while (++len != lenLimit) (len[diff] != len[0]) ;
inc len
; cmp len_x, lenLimit_x
je short lenLimit_reach
movzx t0_x, BYTE PTR [diff + 1 * len]
cmp BYTE PTR [cur + 1 * len], t0_L
je match_loop
mismatch:
jb left_2
mov [ptr1], m
mov m, [cp_r + 4]
lea ptr1, [cp_r + 4]
mov len1, len
jmp max_update
MY_ALIGN_32
left_2:
mov [ptr0], m
mov m, [cp_r]
mov ptr0, cp_r
mov len0, len
max_update:
sub diff, cur ; restore diff
cmp maxLen, len
jae next_node
mov maxLen, len
add len, lenLimit
mov [d], len_x
mov t0_x, diff_x
not t0_x
mov [d + 4], t0_x
add d, 8
jmp next_node
MY_ALIGN_32
lenLimit_reach:
mov delta_r, cur
sub delta_r, diff
lea delta1_r, [delta_r - 1]
mov t0_x, [cp_r]
mov [ptr1], t0_x
mov t0_x, [cp_r + 4]
mov [ptr0], t0_x
mov [d], lenLimit_x
mov [d + 4], delta1_x
add d, 8
; _distances[-1] = (UInt32)(d - _distances);
mov t0, distances
mov t1, d
sub t1, t0
shr t1_x, 2
mov [t0 - 4], t1_x
mov hash, hash_VAR
mov hash_lim, size_VAR
inc pos
mov cp_x, cycPos_VAR
inc cp_x
mov d_lim, limit_VAR
mov cycSize, cycSize_VAR
; if (hash == size || *hash != delta || lenLimit[diff] != lenLimit[0] || d >= limit)
; break;
cmp hash, hash_lim
je fin
cmp d, d_lim
jae fin
cmp delta_x, [hash]
jne main_loop
movzx t0_x, BYTE PTR [diff]
cmp [cur], t0_L
jne main_loop
; jmp main_loop ; bypass for debug
mov cycPos_VAR, cp_x
shl len, 3 ; cycSize * 8
sub diff, cur ; restore diff
xor t0_x, t0_x
cmp cp_x, delta_x ; cmp (cycPos_VAR, delta)
lea cp_r, [son + 8 * cp_r] ; dest
lea src, [cp_r + 8 * diff]
cmovb t0, len ; t0 = (cycPos_VAR < delta ? cycSize * 8 : 0)
add src, t0
add len, son ; len = son + cycSize * 8
MY_ALIGN_32
long_loop:
add hash, 4
; *(UInt64 *)(void *)ptr = ((const UInt64 *)(const void *)ptr)[diff];
mov t0, [src]
add src, 8
mov [cp_r], t0
add cp_r, 8
cmp src, len
cmove src, son ; if end of (son) buffer is reached, we wrap to begin
mov DWORD PTR [d], 2
mov [d + 4], lenLimit_x
mov [d + 8], delta1_x
add d, 12
inc cur
cmp hash, hash_lim
je long_footer
cmp delta_x, [hash]
jne long_footer
movzx t0_x, BYTE PTR [diff + 1 * cur]
cmp [cur], t0_L
jne long_footer
cmp d, d_lim
jb long_loop
long_footer:
sub cp_r, son
shr cp_r, 3
add pos, cp_x
sub pos, cycPos_VAR
mov cycSize, cycSize_VAR
cmp d, d_lim
jae fin
cmp hash, hash_lim
jne main_loop
jmp fin
fin_error:
xor d, d
fin:
mov RSP, Old_RSP
mov t0, [r4 + posRes_OFFS]
mov [t0], pos
mov r0, d
MY_POP_PRESERVED_ABI_REGS
MY_ENDP
_TEXT$LZFINDOPT ENDS
end

1303
Asm/x86/LzmaDecOpt.asm Executable file
View File

File diff suppressed because it is too large Load Diff

263
Asm/x86/Sha1Opt.asm Executable file
View File

@@ -0,0 +1,263 @@
; Sha1Opt.asm -- SHA-1 optimized code for SHA-1 x86 hardware instructions
; 2021-03-10 : Igor Pavlov : Public domain
include 7zAsm.asm
MY_ASM_START
CONST SEGMENT
align 16
Reverse_Endian_Mask db 15,14,13,12, 11,10,9,8, 7,6,5,4, 3,2,1,0
CONST ENDS
; _TEXT$SHA1OPT SEGMENT 'CODE'
ifndef x64
.686
.xmm
endif
ifdef x64
rNum equ REG_ABI_PARAM_2
if (IS_LINUX eq 0)
LOCAL_SIZE equ (16 * 2)
endif
else
rNum equ r0
LOCAL_SIZE equ (16 * 1)
endif
rState equ REG_ABI_PARAM_0
rData equ REG_ABI_PARAM_1
MY_sha1rnds4 macro a1, a2, imm
db 0fH, 03aH, 0ccH, (0c0H + a1 * 8 + a2), imm
endm
MY_SHA_INSTR macro cmd, a1, a2
db 0fH, 038H, cmd, (0c0H + a1 * 8 + a2)
endm
cmd_sha1nexte equ 0c8H
cmd_sha1msg1 equ 0c9H
cmd_sha1msg2 equ 0caH
MY_sha1nexte macro a1, a2
MY_SHA_INSTR cmd_sha1nexte, a1, a2
endm
MY_sha1msg1 macro a1, a2
MY_SHA_INSTR cmd_sha1msg1, a1, a2
endm
MY_sha1msg2 macro a1, a2
MY_SHA_INSTR cmd_sha1msg2, a1, a2
endm
MY_PROLOG macro
ifdef x64
if (IS_LINUX eq 0)
movdqa [r4 + 8], xmm6
movdqa [r4 + 8 + 16], xmm7
sub r4, LOCAL_SIZE + 8
movdqa [r4 ], xmm8
movdqa [r4 + 16], xmm9
endif
else ; x86
if (IS_CDECL gt 0)
mov rState, [r4 + REG_SIZE * 1]
mov rData, [r4 + REG_SIZE * 2]
mov rNum, [r4 + REG_SIZE * 3]
else ; fastcall
mov rNum, [r4 + REG_SIZE * 1]
endif
push r5
mov r5, r4
and r4, -16
sub r4, LOCAL_SIZE
endif
endm
MY_EPILOG macro
ifdef x64
if (IS_LINUX eq 0)
movdqa xmm8, [r4]
movdqa xmm9, [r4 + 16]
add r4, LOCAL_SIZE + 8
movdqa xmm6, [r4 + 8]
movdqa xmm7, [r4 + 8 + 16]
endif
else ; x86
mov r4, r5
pop r5
endif
MY_ENDP
endm
e0_N equ 0
e1_N equ 1
abcd_N equ 2
e0_save_N equ 3
w_regs equ 4
e0 equ @CatStr(xmm, %e0_N)
e1 equ @CatStr(xmm, %e1_N)
abcd equ @CatStr(xmm, %abcd_N)
e0_save equ @CatStr(xmm, %e0_save_N)
ifdef x64
abcd_save equ xmm8
mask2 equ xmm9
else
abcd_save equ [r4]
mask2 equ e1
endif
LOAD_MASK macro
movdqa mask2, XMMWORD PTR Reverse_Endian_Mask
endm
LOAD_W macro k:req
movdqu @CatStr(xmm, %(w_regs + k)), [rData + (16 * (k))]
pshufb @CatStr(xmm, %(w_regs + k)), mask2
endm
; pre2 can be 2 or 3 (recommended)
pre2 equ 3
pre1 equ (pre2 + 1)
NUM_ROUNDS4 equ 20
RND4 macro k
movdqa @CatStr(xmm, %(e0_N + ((k + 1) mod 2))), abcd
MY_sha1rnds4 abcd_N, (e0_N + (k mod 2)), k / 5
nextM = (w_regs + ((k + 1) mod 4))
if (k EQ NUM_ROUNDS4 - 1)
nextM = e0_save_N
endif
MY_sha1nexte (e0_N + ((k + 1) mod 2)), nextM
if (k GE (4 - pre2)) AND (k LT (NUM_ROUNDS4 - pre2))
pxor @CatStr(xmm, %(w_regs + ((k + pre2) mod 4))), @CatStr(xmm, %(w_regs + ((k + pre2 - 2) mod 4)))
endif
if (k GE (4 - pre1)) AND (k LT (NUM_ROUNDS4 - pre1))
MY_sha1msg1 (w_regs + ((k + pre1) mod 4)), (w_regs + ((k + pre1 - 3) mod 4))
endif
if (k GE (4 - pre2)) AND (k LT (NUM_ROUNDS4 - pre2))
MY_sha1msg2 (w_regs + ((k + pre2) mod 4)), (w_regs + ((k + pre2 - 1) mod 4))
endif
endm
REVERSE_STATE macro
; abcd ; dcba
; e0 ; 000e
pshufd abcd, abcd, 01bH ; abcd
pshufd e0, e0, 01bH ; e000
endm
MY_PROC Sha1_UpdateBlocks_HW, 3
MY_PROLOG
cmp rNum, 0
je end_c
movdqu abcd, [rState] ; dcba
movd e0, dword ptr [rState + 16] ; 000e
REVERSE_STATE
ifdef x64
LOAD_MASK
endif
align 16
nextBlock:
movdqa abcd_save, abcd
movdqa e0_save, e0
ifndef x64
LOAD_MASK
endif
LOAD_W 0
LOAD_W 1
LOAD_W 2
LOAD_W 3
paddd e0, @CatStr(xmm, %(w_regs))
k = 0
rept NUM_ROUNDS4
RND4 k
k = k + 1
endm
paddd abcd, abcd_save
add rData, 64
sub rNum, 1
jnz nextBlock
REVERSE_STATE
movdqu [rState], abcd
movd dword ptr [rState + 16], e0
end_c:
MY_EPILOG
; _TEXT$SHA1OPT ENDS
end

275
Asm/x86/Sha256Opt.asm Executable file
View File

@@ -0,0 +1,275 @@
; Sha256Opt.asm -- SHA-256 optimized code for SHA-256 x86 hardware instructions
; 2022-04-17 : Igor Pavlov : Public domain
include 7zAsm.asm
MY_ASM_START
; .data
; public K
; we can use external SHA256_K_ARRAY defined in Sha256.c
; but we must guarantee that SHA256_K_ARRAY is aligned for 16-bytes
COMMENT @
ifdef x64
K_CONST equ SHA256_K_ARRAY
else
K_CONST equ _SHA256_K_ARRAY
endif
EXTRN K_CONST:xmmword
@
CONST SEGMENT
align 16
Reverse_Endian_Mask db 3,2,1,0, 7,6,5,4, 11,10,9,8, 15,14,13,12
; COMMENT @
align 16
K_CONST \
DD 0428a2f98H, 071374491H, 0b5c0fbcfH, 0e9b5dba5H
DD 03956c25bH, 059f111f1H, 0923f82a4H, 0ab1c5ed5H
DD 0d807aa98H, 012835b01H, 0243185beH, 0550c7dc3H
DD 072be5d74H, 080deb1feH, 09bdc06a7H, 0c19bf174H
DD 0e49b69c1H, 0efbe4786H, 00fc19dc6H, 0240ca1ccH
DD 02de92c6fH, 04a7484aaH, 05cb0a9dcH, 076f988daH
DD 0983e5152H, 0a831c66dH, 0b00327c8H, 0bf597fc7H
DD 0c6e00bf3H, 0d5a79147H, 006ca6351H, 014292967H
DD 027b70a85H, 02e1b2138H, 04d2c6dfcH, 053380d13H
DD 0650a7354H, 0766a0abbH, 081c2c92eH, 092722c85H
DD 0a2bfe8a1H, 0a81a664bH, 0c24b8b70H, 0c76c51a3H
DD 0d192e819H, 0d6990624H, 0f40e3585H, 0106aa070H
DD 019a4c116H, 01e376c08H, 02748774cH, 034b0bcb5H
DD 0391c0cb3H, 04ed8aa4aH, 05b9cca4fH, 0682e6ff3H
DD 0748f82eeH, 078a5636fH, 084c87814H, 08cc70208H
DD 090befffaH, 0a4506cebH, 0bef9a3f7H, 0c67178f2H
; @
CONST ENDS
; _TEXT$SHA256OPT SEGMENT 'CODE'
ifndef x64
.686
.xmm
endif
; jwasm-based assemblers for linux and linker from new versions of binutils
; can generate incorrect code for load [ARRAY + offset] instructions.
; 22.00: we load K_CONST offset to (rTable) register to avoid jwasm+binutils problem
rTable equ r0
; rTable equ K_CONST
ifdef x64
rNum equ REG_ABI_PARAM_2
if (IS_LINUX eq 0)
LOCAL_SIZE equ (16 * 2)
endif
else
rNum equ r3
LOCAL_SIZE equ (16 * 1)
endif
rState equ REG_ABI_PARAM_0
rData equ REG_ABI_PARAM_1
MY_SHA_INSTR macro cmd, a1, a2
db 0fH, 038H, cmd, (0c0H + a1 * 8 + a2)
endm
cmd_sha256rnds2 equ 0cbH
cmd_sha256msg1 equ 0ccH
cmd_sha256msg2 equ 0cdH
MY_sha256rnds2 macro a1, a2
MY_SHA_INSTR cmd_sha256rnds2, a1, a2
endm
MY_sha256msg1 macro a1, a2
MY_SHA_INSTR cmd_sha256msg1, a1, a2
endm
MY_sha256msg2 macro a1, a2
MY_SHA_INSTR cmd_sha256msg2, a1, a2
endm
MY_PROLOG macro
ifdef x64
if (IS_LINUX eq 0)
movdqa [r4 + 8], xmm6
movdqa [r4 + 8 + 16], xmm7
sub r4, LOCAL_SIZE + 8
movdqa [r4 ], xmm8
movdqa [r4 + 16], xmm9
endif
else ; x86
push r3
push r5
mov r5, r4
NUM_PUSH_REGS equ 2
PARAM_OFFSET equ (REG_SIZE * (1 + NUM_PUSH_REGS))
if (IS_CDECL gt 0)
mov rState, [r4 + PARAM_OFFSET]
mov rData, [r4 + PARAM_OFFSET + REG_SIZE * 1]
mov rNum, [r4 + PARAM_OFFSET + REG_SIZE * 2]
else ; fastcall
mov rNum, [r4 + PARAM_OFFSET]
endif
and r4, -16
sub r4, LOCAL_SIZE
endif
endm
MY_EPILOG macro
ifdef x64
if (IS_LINUX eq 0)
movdqa xmm8, [r4]
movdqa xmm9, [r4 + 16]
add r4, LOCAL_SIZE + 8
movdqa xmm6, [r4 + 8]
movdqa xmm7, [r4 + 8 + 16]
endif
else ; x86
mov r4, r5
pop r5
pop r3
endif
MY_ENDP
endm
msg equ xmm0
tmp equ xmm0
state0_N equ 2
state1_N equ 3
w_regs equ 4
state1_save equ xmm1
state0 equ @CatStr(xmm, %state0_N)
state1 equ @CatStr(xmm, %state1_N)
ifdef x64
state0_save equ xmm8
mask2 equ xmm9
else
state0_save equ [r4]
mask2 equ xmm0
endif
LOAD_MASK macro
movdqa mask2, XMMWORD PTR Reverse_Endian_Mask
endm
LOAD_W macro k:req
movdqu @CatStr(xmm, %(w_regs + k)), [rData + (16 * (k))]
pshufb @CatStr(xmm, %(w_regs + k)), mask2
endm
; pre1 <= 4 && pre2 >= 1 && pre1 > pre2 && (pre1 - pre2) <= 1
pre1 equ 3
pre2 equ 2
RND4 macro k
movdqa msg, xmmword ptr [rTable + (k) * 16]
paddd msg, @CatStr(xmm, %(w_regs + ((k + 0) mod 4)))
MY_sha256rnds2 state0_N, state1_N
pshufd msg, msg, 0eH
if (k GE (4 - pre1)) AND (k LT (16 - pre1))
; w4[0] = msg1(w4[-4], w4[-3])
MY_sha256msg1 (w_regs + ((k + pre1) mod 4)), (w_regs + ((k + pre1 - 3) mod 4))
endif
MY_sha256rnds2 state1_N, state0_N
if (k GE (4 - pre2)) AND (k LT (16 - pre2))
movdqa tmp, @CatStr(xmm, %(w_regs + ((k + pre2 - 1) mod 4)))
palignr tmp, @CatStr(xmm, %(w_regs + ((k + pre2 - 2) mod 4))), 4
paddd @CatStr(xmm, %(w_regs + ((k + pre2) mod 4))), tmp
; w4[0] = msg2(w4[0], w4[-1])
MY_sha256msg2 %(w_regs + ((k + pre2) mod 4)), %(w_regs + ((k + pre2 - 1) mod 4))
endif
endm
REVERSE_STATE macro
; state0 ; dcba
; state1 ; hgfe
pshufd tmp, state0, 01bH ; abcd
pshufd state0, state1, 01bH ; efgh
movdqa state1, state0 ; efgh
punpcklqdq state0, tmp ; cdgh
punpckhqdq state1, tmp ; abef
endm
MY_PROC Sha256_UpdateBlocks_HW, 3
MY_PROLOG
lea rTable, [K_CONST]
cmp rNum, 0
je end_c
movdqu state0, [rState] ; dcba
movdqu state1, [rState + 16] ; hgfe
REVERSE_STATE
ifdef x64
LOAD_MASK
endif
align 16
nextBlock:
movdqa state0_save, state0
movdqa state1_save, state1
ifndef x64
LOAD_MASK
endif
LOAD_W 0
LOAD_W 1
LOAD_W 2
LOAD_W 3
k = 0
rept 16
RND4 k
k = k + 1
endm
paddd state0, state0_save
paddd state1, state1_save
add rData, 64
sub rNum, 1
jnz nextBlock
REVERSE_STATE
movdqu [rState], state0
movdqu [rState + 16], state1
end_c:
MY_EPILOG
; _TEXT$SHA256OPT ENDS
end

239
Asm/x86/XzCrc64Opt.asm Executable file
View File

@@ -0,0 +1,239 @@
; XzCrc64Opt.asm -- CRC64 calculation : optimized version
; 2021-02-06 : Igor Pavlov : Public domain
include 7zAsm.asm
MY_ASM_START
ifdef x64
rD equ r9
rN equ r10
rT equ r5
num_VAR equ r8
SRCDAT4 equ dword ptr [rD + rN * 1]
CRC_XOR macro dest:req, src:req, t:req
xor dest, QWORD PTR [rT + src * 8 + 0800h * t]
endm
CRC1b macro
movzx x6, BYTE PTR [rD]
inc rD
movzx x3, x0_L
xor x6, x3
shr r0, 8
CRC_XOR r0, r6, 0
dec rN
endm
MY_PROLOG macro crc_end:req
ifdef ABI_LINUX
MY_PUSH_2_REGS
else
MY_PUSH_4_REGS
endif
mov r0, REG_ABI_PARAM_0
mov rN, REG_ABI_PARAM_2
mov rT, REG_ABI_PARAM_3
mov rD, REG_ABI_PARAM_1
test rN, rN
jz crc_end
@@:
test rD, 3
jz @F
CRC1b
jnz @B
@@:
cmp rN, 8
jb crc_end
add rN, rD
mov num_VAR, rN
sub rN, 4
and rN, NOT 3
sub rD, rN
mov x1, SRCDAT4
xor r0, r1
add rN, 4
endm
MY_EPILOG macro crc_end:req
sub rN, 4
mov x1, SRCDAT4
xor r0, r1
mov rD, rN
mov rN, num_VAR
sub rN, rD
crc_end:
test rN, rN
jz @F
CRC1b
jmp crc_end
@@:
ifdef ABI_LINUX
MY_POP_2_REGS
else
MY_POP_4_REGS
endif
endm
MY_PROC XzCrc64UpdateT4, 4
MY_PROLOG crc_end_4
align 16
main_loop_4:
mov x1, SRCDAT4
movzx x2, x0_L
movzx x3, x0_H
shr r0, 16
movzx x6, x0_L
movzx x7, x0_H
shr r0, 16
CRC_XOR r1, r2, 3
CRC_XOR r0, r3, 2
CRC_XOR r1, r6, 1
CRC_XOR r0, r7, 0
xor r0, r1
add rD, 4
jnz main_loop_4
MY_EPILOG crc_end_4
MY_ENDP
else
; x86 (32-bit)
rD equ r1
rN equ r7
rT equ r5
crc_OFFS equ (REG_SIZE * 5)
if (IS_CDECL gt 0) or (IS_LINUX gt 0)
; cdecl or (GNU fastcall) stack:
; (UInt32 *) table
; size_t size
; void * data
; (UInt64) crc
; ret-ip <-(r4)
data_OFFS equ (8 + crc_OFFS)
size_OFFS equ (REG_SIZE + data_OFFS)
table_OFFS equ (REG_SIZE + size_OFFS)
num_VAR equ [r4 + size_OFFS]
table_VAR equ [r4 + table_OFFS]
else
; Windows fastcall:
; r1 = data, r2 = size
; stack:
; (UInt32 *) table
; (UInt64) crc
; ret-ip <-(r4)
table_OFFS equ (8 + crc_OFFS)
table_VAR equ [r4 + table_OFFS]
num_VAR equ table_VAR
endif
SRCDAT4 equ dword ptr [rD + rN * 1]
CRC macro op0:req, op1:req, dest0:req, dest1:req, src:req, t:req
op0 dest0, DWORD PTR [rT + src * 8 + 0800h * t]
op1 dest1, DWORD PTR [rT + src * 8 + 0800h * t + 4]
endm
CRC_XOR macro dest0:req, dest1:req, src:req, t:req
CRC xor, xor, dest0, dest1, src, t
endm
CRC1b macro
movzx x6, BYTE PTR [rD]
inc rD
movzx x3, x0_L
xor x6, x3
shrd r0, r2, 8
shr r2, 8
CRC_XOR r0, r2, r6, 0
dec rN
endm
MY_PROLOG macro crc_end:req
MY_PUSH_4_REGS
if (IS_CDECL gt 0) or (IS_LINUX gt 0)
proc_numParams = proc_numParams + 2 ; for ABI_LINUX
mov rN, [r4 + size_OFFS]
mov rD, [r4 + data_OFFS]
else
mov rN, r2
endif
mov x0, [r4 + crc_OFFS]
mov x2, [r4 + crc_OFFS + 4]
mov rT, table_VAR
test rN, rN
jz crc_end
@@:
test rD, 3
jz @F
CRC1b
jnz @B
@@:
cmp rN, 8
jb crc_end
add rN, rD
mov num_VAR, rN
sub rN, 4
and rN, NOT 3
sub rD, rN
xor r0, SRCDAT4
add rN, 4
endm
MY_EPILOG macro crc_end:req
sub rN, 4
xor r0, SRCDAT4
mov rD, rN
mov rN, num_VAR
sub rN, rD
crc_end:
test rN, rN
jz @F
CRC1b
jmp crc_end
@@:
MY_POP_4_REGS
endm
MY_PROC XzCrc64UpdateT4, 5
MY_PROLOG crc_end_4
movzx x6, x0_L
align 16
main_loop_4:
mov r3, SRCDAT4
xor r3, r2
CRC xor, mov, r3, r2, r6, 3
movzx x6, x0_H
shr r0, 16
CRC_XOR r3, r2, r6, 2
movzx x6, x0_L
movzx x0, x0_H
CRC_XOR r3, r2, r6, 1
CRC_XOR r3, r2, r0, 0
movzx x6, x3_L
mov r0, r3
add rD, 4
jnz main_loop_4
MY_EPILOG crc_end_4
MY_ENDP
endif ; ! x64
end

204
C/7z.h Executable file
View File

@@ -0,0 +1,204 @@
/* 7z.h -- 7z interface
2023-04-02 : Igor Pavlov : Public domain */
#ifndef ZIP7_INC_7Z_H
#define ZIP7_INC_7Z_H
#include "7zTypes.h"
EXTERN_C_BEGIN
#define k7zStartHeaderSize 0x20
#define k7zSignatureSize 6
extern const Byte k7zSignature[k7zSignatureSize];
typedef struct
{
const Byte *Data;
size_t Size;
} CSzData;
/* CSzCoderInfo & CSzFolder support only default methods */
typedef struct
{
size_t PropsOffset;
UInt32 MethodID;
Byte NumStreams;
Byte PropsSize;
} CSzCoderInfo;
typedef struct
{
UInt32 InIndex;
UInt32 OutIndex;
} CSzBond;
#define SZ_NUM_CODERS_IN_FOLDER_MAX 4
#define SZ_NUM_BONDS_IN_FOLDER_MAX 3
#define SZ_NUM_PACK_STREAMS_IN_FOLDER_MAX 4
typedef struct
{
UInt32 NumCoders;
UInt32 NumBonds;
UInt32 NumPackStreams;
UInt32 UnpackStream;
UInt32 PackStreams[SZ_NUM_PACK_STREAMS_IN_FOLDER_MAX];
CSzBond Bonds[SZ_NUM_BONDS_IN_FOLDER_MAX];
CSzCoderInfo Coders[SZ_NUM_CODERS_IN_FOLDER_MAX];
} CSzFolder;
SRes SzGetNextFolderItem(CSzFolder *f, CSzData *sd);
typedef struct
{
UInt32 Low;
UInt32 High;
} CNtfsFileTime;
typedef struct
{
Byte *Defs; /* MSB 0 bit numbering */
UInt32 *Vals;
} CSzBitUi32s;
typedef struct
{
Byte *Defs; /* MSB 0 bit numbering */
// UInt64 *Vals;
CNtfsFileTime *Vals;
} CSzBitUi64s;
#define SzBitArray_Check(p, i) (((p)[(i) >> 3] & (0x80 >> ((i) & 7))) != 0)
#define SzBitWithVals_Check(p, i) ((p)->Defs && ((p)->Defs[(i) >> 3] & (0x80 >> ((i) & 7))) != 0)
typedef struct
{
UInt32 NumPackStreams;
UInt32 NumFolders;
UInt64 *PackPositions; // NumPackStreams + 1
CSzBitUi32s FolderCRCs; // NumFolders
size_t *FoCodersOffsets; // NumFolders + 1
UInt32 *FoStartPackStreamIndex; // NumFolders + 1
UInt32 *FoToCoderUnpackSizes; // NumFolders + 1
Byte *FoToMainUnpackSizeIndex; // NumFolders
UInt64 *CoderUnpackSizes; // for all coders in all folders
Byte *CodersData;
UInt64 RangeLimit;
} CSzAr;
UInt64 SzAr_GetFolderUnpackSize(const CSzAr *p, UInt32 folderIndex);
SRes SzAr_DecodeFolder(const CSzAr *p, UInt32 folderIndex,
ILookInStreamPtr stream, UInt64 startPos,
Byte *outBuffer, size_t outSize,
ISzAllocPtr allocMain);
typedef struct
{
CSzAr db;
UInt64 startPosAfterHeader;
UInt64 dataPos;
UInt32 NumFiles;
UInt64 *UnpackPositions; // NumFiles + 1
// Byte *IsEmptyFiles;
Byte *IsDirs;
CSzBitUi32s CRCs;
CSzBitUi32s Attribs;
// CSzBitUi32s Parents;
CSzBitUi64s MTime;
CSzBitUi64s CTime;
UInt32 *FolderToFile; // NumFolders + 1
UInt32 *FileToFolder; // NumFiles
size_t *FileNameOffsets; /* in 2-byte steps */
Byte *FileNames; /* UTF-16-LE */
} CSzArEx;
#define SzArEx_IsDir(p, i) (SzBitArray_Check((p)->IsDirs, i))
#define SzArEx_GetFileSize(p, i) ((p)->UnpackPositions[(i) + 1] - (p)->UnpackPositions[i])
void SzArEx_Init(CSzArEx *p);
void SzArEx_Free(CSzArEx *p, ISzAllocPtr alloc);
UInt64 SzArEx_GetFolderStreamPos(const CSzArEx *p, UInt32 folderIndex, UInt32 indexInFolder);
int SzArEx_GetFolderFullPackSize(const CSzArEx *p, UInt32 folderIndex, UInt64 *resSize);
/*
if dest == NULL, the return value specifies the required size of the buffer,
in 16-bit characters, including the null-terminating character.
if dest != NULL, the return value specifies the number of 16-bit characters that
are written to the dest, including the null-terminating character. */
size_t SzArEx_GetFileNameUtf16(const CSzArEx *p, size_t fileIndex, UInt16 *dest);
/*
size_t SzArEx_GetFullNameLen(const CSzArEx *p, size_t fileIndex);
UInt16 *SzArEx_GetFullNameUtf16_Back(const CSzArEx *p, size_t fileIndex, UInt16 *dest);
*/
/*
SzArEx_Extract extracts file from archive
*outBuffer must be 0 before first call for each new archive.
Extracting cache:
If you need to decompress more than one file, you can send
these values from previous call:
*blockIndex,
*outBuffer,
*outBufferSize
You can consider "*outBuffer" as cache of solid block. If your archive is solid,
it will increase decompression speed.
If you use external function, you can declare these 3 cache variables
(blockIndex, outBuffer, outBufferSize) as static in that external function.
Free *outBuffer and set *outBuffer to 0, if you want to flush cache.
*/
SRes SzArEx_Extract(
const CSzArEx *db,
ILookInStreamPtr inStream,
UInt32 fileIndex, /* index of file */
UInt32 *blockIndex, /* index of solid block */
Byte **outBuffer, /* pointer to pointer to output buffer (allocated with allocMain) */
size_t *outBufferSize, /* buffer size for output buffer */
size_t *offset, /* offset of stream for required file in *outBuffer */
size_t *outSizeProcessed, /* size of file in *outBuffer */
ISzAllocPtr allocMain,
ISzAllocPtr allocTemp);
/*
SzArEx_Open Errors:
SZ_ERROR_NO_ARCHIVE
SZ_ERROR_ARCHIVE
SZ_ERROR_UNSUPPORTED
SZ_ERROR_MEM
SZ_ERROR_CRC
SZ_ERROR_INPUT_EOF
SZ_ERROR_FAIL
*/
SRes SzArEx_Open(CSzArEx *p, ILookInStreamPtr inStream,
ISzAllocPtr allocMain, ISzAllocPtr allocTemp);
EXTERN_C_END
#endif

89
C/7zAlloc.c Executable file
View File

@@ -0,0 +1,89 @@
/* 7zAlloc.c -- Allocation functions for 7z processing
2023-03-04 : Igor Pavlov : Public domain */
#include "Precomp.h"
#include <stdlib.h>
#include "7zAlloc.h"
/* #define SZ_ALLOC_DEBUG */
/* use SZ_ALLOC_DEBUG to debug alloc/free operations */
#ifdef SZ_ALLOC_DEBUG
/*
#ifdef _WIN32
#include "7zWindows.h"
#endif
*/
#include <stdio.h>
static int g_allocCount = 0;
static int g_allocCountTemp = 0;
static void Print_Alloc(const char *s, size_t size, int *counter)
{
const unsigned size2 = (unsigned)size;
fprintf(stderr, "\n%s count = %10d : %10u bytes; ", s, *counter, size2);
(*counter)++;
}
static void Print_Free(const char *s, int *counter)
{
(*counter)--;
fprintf(stderr, "\n%s count = %10d", s, *counter);
}
#endif
void *SzAlloc(ISzAllocPtr p, size_t size)
{
UNUSED_VAR(p)
if (size == 0)
return 0;
#ifdef SZ_ALLOC_DEBUG
Print_Alloc("Alloc", size, &g_allocCount);
#endif
return malloc(size);
}
void SzFree(ISzAllocPtr p, void *address)
{
UNUSED_VAR(p)
#ifdef SZ_ALLOC_DEBUG
if (address)
Print_Free("Free ", &g_allocCount);
#endif
free(address);
}
void *SzAllocTemp(ISzAllocPtr p, size_t size)
{
UNUSED_VAR(p)
if (size == 0)
return 0;
#ifdef SZ_ALLOC_DEBUG
Print_Alloc("Alloc_temp", size, &g_allocCountTemp);
/*
#ifdef _WIN32
return HeapAlloc(GetProcessHeap(), 0, size);
#endif
*/
#endif
return malloc(size);
}
void SzFreeTemp(ISzAllocPtr p, void *address)
{
UNUSED_VAR(p)
#ifdef SZ_ALLOC_DEBUG
if (address)
Print_Free("Free_temp ", &g_allocCountTemp);
/*
#ifdef _WIN32
HeapFree(GetProcessHeap(), 0, address);
return;
#endif
*/
#endif
free(address);
}

19
C/7zAlloc.h Executable file
View File

@@ -0,0 +1,19 @@
/* 7zAlloc.h -- Allocation functions
2023-03-04 : Igor Pavlov : Public domain */
#ifndef ZIP7_INC_7Z_ALLOC_H
#define ZIP7_INC_7Z_ALLOC_H
#include "7zTypes.h"
EXTERN_C_BEGIN
void *SzAlloc(ISzAllocPtr p, size_t size);
void SzFree(ISzAllocPtr p, void *address);
void *SzAllocTemp(ISzAllocPtr p, size_t size);
void SzFreeTemp(ISzAllocPtr p, void *address);
EXTERN_C_END
#endif

1786
C/7zArcIn.c Executable file
View File

File diff suppressed because it is too large Load Diff

View File

@@ -1,7 +1,7 @@
/* 7zBuf.c -- Byte Buffer
2008-03-28
Igor Pavlov
Public domain */
2017-04-03 : Igor Pavlov : Public domain */
#include "Precomp.h"
#include "7zBuf.h"
@@ -11,7 +11,7 @@ void Buf_Init(CBuf *p)
p->size = 0;
}
int Buf_Create(CBuf *p, size_t size, ISzAlloc *alloc)
int Buf_Create(CBuf *p, size_t size, ISzAllocPtr alloc)
{
p->size = 0;
if (size == 0)
@@ -19,8 +19,8 @@ int Buf_Create(CBuf *p, size_t size, ISzAlloc *alloc)
p->data = 0;
return 1;
}
p->data = (Byte *)alloc->Alloc(alloc, size);
if (p->data != 0)
p->data = (Byte *)ISzAlloc_Alloc(alloc, size);
if (p->data)
{
p->size = size;
return 1;
@@ -28,9 +28,9 @@ int Buf_Create(CBuf *p, size_t size, ISzAlloc *alloc)
return 0;
}
void Buf_Free(CBuf *p, ISzAlloc *alloc)
void Buf_Free(CBuf *p, ISzAllocPtr alloc)
{
alloc->Free(alloc, p->data);
ISzAlloc_Free(alloc, p->data);
p->data = 0;
p->size = 0;
}

View File

@@ -1,10 +1,12 @@
/* 7zBuf.h -- Byte Buffer
2008-10-04 : Igor Pavlov : Public domain */
2023-03-04 : Igor Pavlov : Public domain */
#ifndef __7Z_BUF_H
#define __7Z_BUF_H
#ifndef ZIP7_INC_7Z_BUF_H
#define ZIP7_INC_7Z_BUF_H
#include "Types.h"
#include "7zTypes.h"
EXTERN_C_BEGIN
typedef struct
{
@@ -13,8 +15,8 @@ typedef struct
} CBuf;
void Buf_Init(CBuf *p);
int Buf_Create(CBuf *p, size_t size, ISzAlloc *alloc);
void Buf_Free(CBuf *p, ISzAlloc *alloc);
int Buf_Create(CBuf *p, size_t size, ISzAllocPtr alloc);
void Buf_Free(CBuf *p, ISzAllocPtr alloc);
typedef struct
{
@@ -25,7 +27,9 @@ typedef struct
void DynBuf_Construct(CDynBuf *p);
void DynBuf_SeekToBeg(CDynBuf *p);
int DynBuf_Write(CDynBuf *p, const Byte *buf, size_t size, ISzAlloc *alloc);
void DynBuf_Free(CDynBuf *p, ISzAlloc *alloc);
int DynBuf_Write(CDynBuf *p, const Byte *buf, size_t size, ISzAllocPtr alloc);
void DynBuf_Free(CDynBuf *p, ISzAllocPtr alloc);
EXTERN_C_END
#endif

View File

@@ -1,7 +1,10 @@
/* 7zBuf2.c -- Byte Buffer
2008-10-04 : Igor Pavlov : Public domain */
2017-04-03 : Igor Pavlov : Public domain */
#include "Precomp.h"
#include <string.h>
#include "7zBuf.h"
void DynBuf_Construct(CDynBuf *p)
@@ -16,29 +19,33 @@ void DynBuf_SeekToBeg(CDynBuf *p)
p->pos = 0;
}
int DynBuf_Write(CDynBuf *p, const Byte *buf, size_t size, ISzAlloc *alloc)
int DynBuf_Write(CDynBuf *p, const Byte *buf, size_t size, ISzAllocPtr alloc)
{
if (size > p->size - p->pos)
{
size_t newSize = p->pos + size;
Byte *data;
newSize += newSize / 4;
data = (Byte *)alloc->Alloc(alloc, newSize);
if (data == 0)
data = (Byte *)ISzAlloc_Alloc(alloc, newSize);
if (!data)
return 0;
p->size = newSize;
memcpy(data, p->data, p->pos);
alloc->Free(alloc, p->data);
if (p->pos != 0)
memcpy(data, p->data, p->pos);
ISzAlloc_Free(alloc, p->data);
p->data = data;
}
memcpy(p->data + p->pos, buf, size);
p->pos += size;
if (size != 0)
{
memcpy(p->data + p->pos, buf, size);
p->pos += size;
}
return 1;
}
void DynBuf_Free(CDynBuf *p, ISzAlloc *alloc)
void DynBuf_Free(CDynBuf *p, ISzAllocPtr alloc)
{
alloc->Free(alloc, p->data);
ISzAlloc_Free(alloc, p->data);
p->data = 0;
p->size = 0;
p->pos = 0;

345
C/7zCrc.c
View File

@@ -1,35 +1,340 @@
/* 7zCrc.c -- CRC32 calculation
2008-08-05
Igor Pavlov
Public domain */
/* 7zCrc.c -- CRC32 calculation and init
2023-04-02 : Igor Pavlov : Public domain */
#include "Precomp.h"
#include "7zCrc.h"
#include "CpuArch.h"
#define kCrcPoly 0xEDB88320
UInt32 g_CrcTable[256];
void MY_FAST_CALL CrcGenerateTable(void)
#ifdef MY_CPU_LE
#define CRC_NUM_TABLES 8
#else
#define CRC_NUM_TABLES 9
UInt32 Z7_FASTCALL CrcUpdateT1_BeT4(UInt32 v, const void *data, size_t size, const UInt32 *table);
UInt32 Z7_FASTCALL CrcUpdateT1_BeT8(UInt32 v, const void *data, size_t size, const UInt32 *table);
#endif
#ifndef MY_CPU_BE
UInt32 Z7_FASTCALL CrcUpdateT4(UInt32 v, const void *data, size_t size, const UInt32 *table);
UInt32 Z7_FASTCALL CrcUpdateT8(UInt32 v, const void *data, size_t size, const UInt32 *table);
#endif
/*
extern
CRC_FUNC g_CrcUpdateT4;
CRC_FUNC g_CrcUpdateT4;
*/
extern
CRC_FUNC g_CrcUpdateT8;
CRC_FUNC g_CrcUpdateT8;
extern
CRC_FUNC g_CrcUpdateT0_32;
CRC_FUNC g_CrcUpdateT0_32;
extern
CRC_FUNC g_CrcUpdateT0_64;
CRC_FUNC g_CrcUpdateT0_64;
extern
CRC_FUNC g_CrcUpdate;
CRC_FUNC g_CrcUpdate;
UInt32 g_CrcTable[256 * CRC_NUM_TABLES];
UInt32 Z7_FASTCALL CrcUpdate(UInt32 v, const void *data, size_t size)
{
return g_CrcUpdate(v, data, size, g_CrcTable);
}
UInt32 Z7_FASTCALL CrcCalc(const void *data, size_t size)
{
return g_CrcUpdate(CRC_INIT_VAL, data, size, g_CrcTable) ^ CRC_INIT_VAL;
}
#if CRC_NUM_TABLES < 4 \
|| (CRC_NUM_TABLES == 4 && defined(MY_CPU_BE)) \
|| (!defined(MY_CPU_LE) && !defined(MY_CPU_BE))
#define CRC_UPDATE_BYTE_2(crc, b) (table[((crc) ^ (b)) & 0xFF] ^ ((crc) >> 8))
UInt32 Z7_FASTCALL CrcUpdateT1(UInt32 v, const void *data, size_t size, const UInt32 *table);
UInt32 Z7_FASTCALL CrcUpdateT1(UInt32 v, const void *data, size_t size, const UInt32 *table)
{
const Byte *p = (const Byte *)data;
const Byte *pEnd = p + size;
for (; p != pEnd; p++)
v = CRC_UPDATE_BYTE_2(v, *p);
return v;
}
#endif
/* ---------- hardware CRC ---------- */
#ifdef MY_CPU_LE
#if defined(MY_CPU_ARM_OR_ARM64)
// #pragma message("ARM*")
#if defined(_MSC_VER)
#if defined(MY_CPU_ARM64)
#if (_MSC_VER >= 1910)
#ifndef __clang__
#define USE_ARM64_CRC
#include <intrin.h>
#endif
#endif
#endif
#elif (defined(__clang__) && (__clang_major__ >= 3)) \
|| (defined(__GNUC__) && (__GNUC__ > 4))
#if !defined(__ARM_FEATURE_CRC32)
#define __ARM_FEATURE_CRC32 1
#if defined(__clang__)
#if defined(MY_CPU_ARM64)
#define ATTRIB_CRC __attribute__((__target__("crc")))
#else
#define ATTRIB_CRC __attribute__((__target__("armv8-a,crc")))
#endif
#else
#if defined(MY_CPU_ARM64)
#define ATTRIB_CRC __attribute__((__target__("+crc")))
#else
#define ATTRIB_CRC __attribute__((__target__("arch=armv8-a+crc")))
#endif
#endif
#endif
#if defined(__ARM_FEATURE_CRC32)
#define USE_ARM64_CRC
#include <arm_acle.h>
#endif
#endif
#else
// no hardware CRC
// #define USE_CRC_EMU
#ifdef USE_CRC_EMU
#pragma message("ARM64 CRC emulation")
Z7_FORCE_INLINE
UInt32 __crc32b(UInt32 v, UInt32 data)
{
const UInt32 *table = g_CrcTable;
v = CRC_UPDATE_BYTE_2(v, (Byte)data);
return v;
}
Z7_FORCE_INLINE
UInt32 __crc32w(UInt32 v, UInt32 data)
{
const UInt32 *table = g_CrcTable;
v = CRC_UPDATE_BYTE_2(v, (Byte)data); data >>= 8;
v = CRC_UPDATE_BYTE_2(v, (Byte)data); data >>= 8;
v = CRC_UPDATE_BYTE_2(v, (Byte)data); data >>= 8;
v = CRC_UPDATE_BYTE_2(v, (Byte)data); data >>= 8;
return v;
}
Z7_FORCE_INLINE
UInt32 __crc32d(UInt32 v, UInt64 data)
{
const UInt32 *table = g_CrcTable;
v = CRC_UPDATE_BYTE_2(v, (Byte)data); data >>= 8;
v = CRC_UPDATE_BYTE_2(v, (Byte)data); data >>= 8;
v = CRC_UPDATE_BYTE_2(v, (Byte)data); data >>= 8;
v = CRC_UPDATE_BYTE_2(v, (Byte)data); data >>= 8;
v = CRC_UPDATE_BYTE_2(v, (Byte)data); data >>= 8;
v = CRC_UPDATE_BYTE_2(v, (Byte)data); data >>= 8;
v = CRC_UPDATE_BYTE_2(v, (Byte)data); data >>= 8;
v = CRC_UPDATE_BYTE_2(v, (Byte)data); data >>= 8;
return v;
}
#endif // USE_CRC_EMU
#endif // defined(MY_CPU_ARM64) && defined(MY_CPU_LE)
#if defined(USE_ARM64_CRC) || defined(USE_CRC_EMU)
#define T0_32_UNROLL_BYTES (4 * 4)
#define T0_64_UNROLL_BYTES (4 * 8)
#ifndef ATTRIB_CRC
#define ATTRIB_CRC
#endif
// #pragma message("USE ARM HW CRC")
ATTRIB_CRC
UInt32 Z7_FASTCALL CrcUpdateT0_32(UInt32 v, const void *data, size_t size, const UInt32 *table);
ATTRIB_CRC
UInt32 Z7_FASTCALL CrcUpdateT0_32(UInt32 v, const void *data, size_t size, const UInt32 *table)
{
const Byte *p = (const Byte *)data;
UNUSED_VAR(table);
for (; size != 0 && ((unsigned)(ptrdiff_t)p & (T0_32_UNROLL_BYTES - 1)) != 0; size--)
v = __crc32b(v, *p++);
if (size >= T0_32_UNROLL_BYTES)
{
const Byte *lim = p + size;
size &= (T0_32_UNROLL_BYTES - 1);
lim -= size;
do
{
v = __crc32w(v, *(const UInt32 *)(const void *)(p));
v = __crc32w(v, *(const UInt32 *)(const void *)(p + 4)); p += 2 * 4;
v = __crc32w(v, *(const UInt32 *)(const void *)(p));
v = __crc32w(v, *(const UInt32 *)(const void *)(p + 4)); p += 2 * 4;
}
while (p != lim);
}
for (; size != 0; size--)
v = __crc32b(v, *p++);
return v;
}
ATTRIB_CRC
UInt32 Z7_FASTCALL CrcUpdateT0_64(UInt32 v, const void *data, size_t size, const UInt32 *table);
ATTRIB_CRC
UInt32 Z7_FASTCALL CrcUpdateT0_64(UInt32 v, const void *data, size_t size, const UInt32 *table)
{
const Byte *p = (const Byte *)data;
UNUSED_VAR(table);
for (; size != 0 && ((unsigned)(ptrdiff_t)p & (T0_64_UNROLL_BYTES - 1)) != 0; size--)
v = __crc32b(v, *p++);
if (size >= T0_64_UNROLL_BYTES)
{
const Byte *lim = p + size;
size &= (T0_64_UNROLL_BYTES - 1);
lim -= size;
do
{
v = __crc32d(v, *(const UInt64 *)(const void *)(p));
v = __crc32d(v, *(const UInt64 *)(const void *)(p + 8)); p += 2 * 8;
v = __crc32d(v, *(const UInt64 *)(const void *)(p));
v = __crc32d(v, *(const UInt64 *)(const void *)(p + 8)); p += 2 * 8;
}
while (p != lim);
}
for (; size != 0; size--)
v = __crc32b(v, *p++);
return v;
}
#undef T0_32_UNROLL_BYTES
#undef T0_64_UNROLL_BYTES
#endif // defined(USE_ARM64_CRC) || defined(USE_CRC_EMU)
#endif // MY_CPU_LE
void Z7_FASTCALL CrcGenerateTable(void)
{
UInt32 i;
for (i = 0; i < 256; i++)
{
UInt32 r = i;
int j;
unsigned j;
for (j = 0; j < 8; j++)
r = (r >> 1) ^ (kCrcPoly & ~((r & 1) - 1));
r = (r >> 1) ^ (kCrcPoly & ((UInt32)0 - (r & 1)));
g_CrcTable[i] = r;
}
for (i = 256; i < 256 * CRC_NUM_TABLES; i++)
{
const UInt32 r = g_CrcTable[(size_t)i - 256];
g_CrcTable[i] = g_CrcTable[r & 0xFF] ^ (r >> 8);
}
#if CRC_NUM_TABLES < 4
g_CrcUpdate = CrcUpdateT1;
#elif defined(MY_CPU_LE)
// g_CrcUpdateT4 = CrcUpdateT4;
#if CRC_NUM_TABLES < 8
g_CrcUpdate = CrcUpdateT4;
#else // CRC_NUM_TABLES >= 8
g_CrcUpdateT8 = CrcUpdateT8;
/*
#ifdef MY_CPU_X86_OR_AMD64
if (!CPU_Is_InOrder())
#endif
*/
g_CrcUpdate = CrcUpdateT8;
#endif
#else
{
#ifndef MY_CPU_BE
UInt32 k = 0x01020304;
const Byte *p = (const Byte *)&k;
if (p[0] == 4 && p[1] == 3)
{
#if CRC_NUM_TABLES < 8
// g_CrcUpdateT4 = CrcUpdateT4;
g_CrcUpdate = CrcUpdateT4;
#else // CRC_NUM_TABLES >= 8
g_CrcUpdateT8 = CrcUpdateT8;
g_CrcUpdate = CrcUpdateT8;
#endif
}
else if (p[0] != 1 || p[1] != 2)
g_CrcUpdate = CrcUpdateT1;
else
#endif // MY_CPU_BE
{
for (i = 256 * CRC_NUM_TABLES - 1; i >= 256; i--)
{
const UInt32 x = g_CrcTable[(size_t)i - 256];
g_CrcTable[i] = Z7_BSWAP32(x);
}
#if CRC_NUM_TABLES <= 4
g_CrcUpdate = CrcUpdateT1;
#elif CRC_NUM_TABLES <= 8
// g_CrcUpdateT4 = CrcUpdateT1_BeT4;
g_CrcUpdate = CrcUpdateT1_BeT4;
#else // CRC_NUM_TABLES > 8
g_CrcUpdateT8 = CrcUpdateT1_BeT8;
g_CrcUpdate = CrcUpdateT1_BeT8;
#endif
}
}
#endif // CRC_NUM_TABLES < 4
#ifdef MY_CPU_LE
#ifdef USE_ARM64_CRC
if (CPU_IsSupported_CRC32())
{
g_CrcUpdateT0_32 = CrcUpdateT0_32;
g_CrcUpdateT0_64 = CrcUpdateT0_64;
g_CrcUpdate =
#if defined(MY_CPU_ARM)
CrcUpdateT0_32;
#else
CrcUpdateT0_64;
#endif
}
#endif
#ifdef USE_CRC_EMU
g_CrcUpdateT0_32 = CrcUpdateT0_32;
g_CrcUpdateT0_64 = CrcUpdateT0_64;
g_CrcUpdate = CrcUpdateT0_64;
#endif
#endif
}
UInt32 MY_FAST_CALL CrcUpdate(UInt32 v, const void *data, size_t size)
{
const Byte *p = (const Byte *)data;
for (; size > 0 ; size--, p++)
v = CRC_UPDATE_BYTE(v, *p);
return v;
}
UInt32 MY_FAST_CALL CrcCalc(const void *data, size_t size)
{
return CrcUpdate(CRC_INIT_VAL, data, size) ^ 0xFFFFFFFF;
}
#undef kCrcPoly
#undef CRC64_NUM_TABLES
#undef CRC_UPDATE_BYTE_2

View File

@@ -1,24 +1,27 @@
/* 7zCrc.h -- CRC32 calculation
2008-03-13
Igor Pavlov
Public domain */
2023-04-02 : Igor Pavlov : Public domain */
#ifndef __7Z_CRC_H
#define __7Z_CRC_H
#ifndef ZIP7_INC_7Z_CRC_H
#define ZIP7_INC_7Z_CRC_H
#include <stddef.h>
#include "7zTypes.h"
#include "Types.h"
EXTERN_C_BEGIN
extern UInt32 g_CrcTable[];
void MY_FAST_CALL CrcGenerateTable(void);
/* Call CrcGenerateTable one time before other CRC functions */
void Z7_FASTCALL CrcGenerateTable(void);
#define CRC_INIT_VAL 0xFFFFFFFF
#define CRC_GET_DIGEST(crc) ((crc) ^ 0xFFFFFFFF)
#define CRC_GET_DIGEST(crc) ((crc) ^ CRC_INIT_VAL)
#define CRC_UPDATE_BYTE(crc, b) (g_CrcTable[((crc) ^ (b)) & 0xFF] ^ ((crc) >> 8))
UInt32 MY_FAST_CALL CrcUpdate(UInt32 crc, const void *data, size_t size);
UInt32 MY_FAST_CALL CrcCalc(const void *data, size_t size);
UInt32 Z7_FASTCALL CrcUpdate(UInt32 crc, const void *data, size_t size);
UInt32 Z7_FASTCALL CrcCalc(const void *data, size_t size);
typedef UInt32 (Z7_FASTCALL *CRC_FUNC)(UInt32 v, const void *data, size_t size, const UInt32 *table);
EXTERN_C_END
#endif

117
C/7zCrcOpt.c Executable file
View File

@@ -0,0 +1,117 @@
/* 7zCrcOpt.c -- CRC32 calculation
2023-04-02 : Igor Pavlov : Public domain */
#include "Precomp.h"
#include "CpuArch.h"
#ifndef MY_CPU_BE
#define CRC_UPDATE_BYTE_2(crc, b) (table[((crc) ^ (b)) & 0xFF] ^ ((crc) >> 8))
UInt32 Z7_FASTCALL CrcUpdateT4(UInt32 v, const void *data, size_t size, const UInt32 *table);
UInt32 Z7_FASTCALL CrcUpdateT4(UInt32 v, const void *data, size_t size, const UInt32 *table)
{
const Byte *p = (const Byte *)data;
for (; size > 0 && ((unsigned)(ptrdiff_t)p & 3) != 0; size--, p++)
v = CRC_UPDATE_BYTE_2(v, *p);
for (; size >= 4; size -= 4, p += 4)
{
v ^= *(const UInt32 *)(const void *)p;
v =
(table + 0x300)[((v ) & 0xFF)]
^ (table + 0x200)[((v >> 8) & 0xFF)]
^ (table + 0x100)[((v >> 16) & 0xFF)]
^ (table + 0x000)[((v >> 24))];
}
for (; size > 0; size--, p++)
v = CRC_UPDATE_BYTE_2(v, *p);
return v;
}
UInt32 Z7_FASTCALL CrcUpdateT8(UInt32 v, const void *data, size_t size, const UInt32 *table);
UInt32 Z7_FASTCALL CrcUpdateT8(UInt32 v, const void *data, size_t size, const UInt32 *table)
{
const Byte *p = (const Byte *)data;
for (; size > 0 && ((unsigned)(ptrdiff_t)p & 7) != 0; size--, p++)
v = CRC_UPDATE_BYTE_2(v, *p);
for (; size >= 8; size -= 8, p += 8)
{
UInt32 d;
v ^= *(const UInt32 *)(const void *)p;
v =
(table + 0x700)[((v ) & 0xFF)]
^ (table + 0x600)[((v >> 8) & 0xFF)]
^ (table + 0x500)[((v >> 16) & 0xFF)]
^ (table + 0x400)[((v >> 24))];
d = *((const UInt32 *)(const void *)p + 1);
v ^=
(table + 0x300)[((d ) & 0xFF)]
^ (table + 0x200)[((d >> 8) & 0xFF)]
^ (table + 0x100)[((d >> 16) & 0xFF)]
^ (table + 0x000)[((d >> 24))];
}
for (; size > 0; size--, p++)
v = CRC_UPDATE_BYTE_2(v, *p);
return v;
}
#endif
#ifndef MY_CPU_LE
#define CRC_UINT32_SWAP(v) Z7_BSWAP32(v)
#define CRC_UPDATE_BYTE_2_BE(crc, b) (table[(((crc) >> 24) ^ (b))] ^ ((crc) << 8))
UInt32 Z7_FASTCALL CrcUpdateT1_BeT4(UInt32 v, const void *data, size_t size, const UInt32 *table)
{
const Byte *p = (const Byte *)data;
table += 0x100;
v = CRC_UINT32_SWAP(v);
for (; size > 0 && ((unsigned)(ptrdiff_t)p & 3) != 0; size--, p++)
v = CRC_UPDATE_BYTE_2_BE(v, *p);
for (; size >= 4; size -= 4, p += 4)
{
v ^= *(const UInt32 *)(const void *)p;
v =
(table + 0x000)[((v ) & 0xFF)]
^ (table + 0x100)[((v >> 8) & 0xFF)]
^ (table + 0x200)[((v >> 16) & 0xFF)]
^ (table + 0x300)[((v >> 24))];
}
for (; size > 0; size--, p++)
v = CRC_UPDATE_BYTE_2_BE(v, *p);
return CRC_UINT32_SWAP(v);
}
UInt32 Z7_FASTCALL CrcUpdateT1_BeT8(UInt32 v, const void *data, size_t size, const UInt32 *table)
{
const Byte *p = (const Byte *)data;
table += 0x100;
v = CRC_UINT32_SWAP(v);
for (; size > 0 && ((unsigned)(ptrdiff_t)p & 7) != 0; size--, p++)
v = CRC_UPDATE_BYTE_2_BE(v, *p);
for (; size >= 8; size -= 8, p += 8)
{
UInt32 d;
v ^= *(const UInt32 *)(const void *)p;
v =
(table + 0x400)[((v ) & 0xFF)]
^ (table + 0x500)[((v >> 8) & 0xFF)]
^ (table + 0x600)[((v >> 16) & 0xFF)]
^ (table + 0x700)[((v >> 24))];
d = *((const UInt32 *)(const void *)p + 1);
v ^=
(table + 0x000)[((d ) & 0xFF)]
^ (table + 0x100)[((d >> 8) & 0xFF)]
^ (table + 0x200)[((d >> 16) & 0xFF)]
^ (table + 0x300)[((d >> 24))];
}
for (; size > 0; size--, p++)
v = CRC_UPDATE_BYTE_2_BE(v, *p);
return CRC_UINT32_SWAP(v);
}
#endif

View File

@@ -1,43 +0,0 @@
/* 7zCrcT8.c -- CRC32 calculation with 8 tables
2008-03-19
Igor Pavlov
Public domain */
#include "7zCrc.h"
#define kCrcPoly 0xEDB88320
#define CRC_NUM_TABLES 8
UInt32 g_CrcTable[256 * CRC_NUM_TABLES];
void MY_FAST_CALL CrcGenerateTable()
{
UInt32 i;
for (i = 0; i < 256; i++)
{
UInt32 r = i;
int j;
for (j = 0; j < 8; j++)
r = (r >> 1) ^ (kCrcPoly & ~((r & 1) - 1));
g_CrcTable[i] = r;
}
#if CRC_NUM_TABLES > 1
for (; i < 256 * CRC_NUM_TABLES; i++)
{
UInt32 r = g_CrcTable[i - 256];
g_CrcTable[i] = g_CrcTable[r & 0xFF] ^ (r >> 8);
}
#endif
}
UInt32 MY_FAST_CALL CrcUpdateT8(UInt32 v, const void *data, size_t size, const UInt32 *table);
UInt32 MY_FAST_CALL CrcUpdate(UInt32 v, const void *data, size_t size)
{
return CrcUpdateT8(v, data, size, g_CrcTable);
}
UInt32 MY_FAST_CALL CrcCalc(const void *data, size_t size)
{
return CrcUpdateT8(CRC_INIT_VAL, data, size, g_CrcTable) ^ 0xFFFFFFFF;
}

648
C/7zDec.c Executable file
View File

@@ -0,0 +1,648 @@
/* 7zDec.c -- Decoding from 7z folder
2023-04-02 : Igor Pavlov : Public domain */
#include "Precomp.h"
#include <string.h>
/* #define Z7_PPMD_SUPPORT */
#include "7z.h"
#include "7zCrc.h"
#include "Bcj2.h"
#include "Bra.h"
#include "CpuArch.h"
#include "Delta.h"
#include "LzmaDec.h"
#include "Lzma2Dec.h"
#ifdef Z7_PPMD_SUPPORT
#include "Ppmd7.h"
#endif
#define k_Copy 0
#ifndef Z7_NO_METHOD_LZMA2
#define k_LZMA2 0x21
#endif
#define k_LZMA 0x30101
#define k_BCJ2 0x303011B
#if !defined(Z7_NO_METHODS_FILTERS)
#define Z7_USE_BRANCH_FILTER
#endif
#if !defined(Z7_NO_METHODS_FILTERS) || \
defined(Z7_USE_NATIVE_BRANCH_FILTER) && defined(MY_CPU_ARM64)
#define Z7_USE_FILTER_ARM64
#ifndef Z7_USE_BRANCH_FILTER
#define Z7_USE_BRANCH_FILTER
#endif
#define k_ARM64 0xa
#endif
#if !defined(Z7_NO_METHODS_FILTERS) || \
defined(Z7_USE_NATIVE_BRANCH_FILTER) && defined(MY_CPU_ARMT)
#define Z7_USE_FILTER_ARMT
#ifndef Z7_USE_BRANCH_FILTER
#define Z7_USE_BRANCH_FILTER
#endif
#define k_ARMT 0x3030701
#endif
#ifndef Z7_NO_METHODS_FILTERS
#define k_Delta 3
#define k_BCJ 0x3030103
#define k_PPC 0x3030205
#define k_IA64 0x3030401
#define k_ARM 0x3030501
#define k_SPARC 0x3030805
#endif
#ifdef Z7_PPMD_SUPPORT
#define k_PPMD 0x30401
typedef struct
{
IByteIn vt;
const Byte *cur;
const Byte *end;
const Byte *begin;
UInt64 processed;
BoolInt extra;
SRes res;
ILookInStreamPtr inStream;
} CByteInToLook;
static Byte ReadByte(IByteInPtr pp)
{
Z7_CONTAINER_FROM_VTBL_TO_DECL_VAR_pp_vt_p(CByteInToLook)
if (p->cur != p->end)
return *p->cur++;
if (p->res == SZ_OK)
{
size_t size = (size_t)(p->cur - p->begin);
p->processed += size;
p->res = ILookInStream_Skip(p->inStream, size);
size = (1 << 25);
p->res = ILookInStream_Look(p->inStream, (const void **)&p->begin, &size);
p->cur = p->begin;
p->end = p->begin + size;
if (size != 0)
return *p->cur++;
}
p->extra = True;
return 0;
}
static SRes SzDecodePpmd(const Byte *props, unsigned propsSize, UInt64 inSize, ILookInStreamPtr inStream,
Byte *outBuffer, SizeT outSize, ISzAllocPtr allocMain)
{
CPpmd7 ppmd;
CByteInToLook s;
SRes res = SZ_OK;
s.vt.Read = ReadByte;
s.inStream = inStream;
s.begin = s.end = s.cur = NULL;
s.extra = False;
s.res = SZ_OK;
s.processed = 0;
if (propsSize != 5)
return SZ_ERROR_UNSUPPORTED;
{
unsigned order = props[0];
UInt32 memSize = GetUi32(props + 1);
if (order < PPMD7_MIN_ORDER ||
order > PPMD7_MAX_ORDER ||
memSize < PPMD7_MIN_MEM_SIZE ||
memSize > PPMD7_MAX_MEM_SIZE)
return SZ_ERROR_UNSUPPORTED;
Ppmd7_Construct(&ppmd);
if (!Ppmd7_Alloc(&ppmd, memSize, allocMain))
return SZ_ERROR_MEM;
Ppmd7_Init(&ppmd, order);
}
{
ppmd.rc.dec.Stream = &s.vt;
if (!Ppmd7z_RangeDec_Init(&ppmd.rc.dec))
res = SZ_ERROR_DATA;
else if (!s.extra)
{
Byte *buf = outBuffer;
const Byte *lim = buf + outSize;
for (; buf != lim; buf++)
{
int sym = Ppmd7z_DecodeSymbol(&ppmd);
if (s.extra || sym < 0)
break;
*buf = (Byte)sym;
}
if (buf != lim)
res = SZ_ERROR_DATA;
else if (!Ppmd7z_RangeDec_IsFinishedOK(&ppmd.rc.dec))
{
/* if (Ppmd7z_DecodeSymbol(&ppmd) != PPMD7_SYM_END || !Ppmd7z_RangeDec_IsFinishedOK(&ppmd.rc.dec)) */
res = SZ_ERROR_DATA;
}
}
if (s.extra)
res = (s.res != SZ_OK ? s.res : SZ_ERROR_DATA);
else if (s.processed + (size_t)(s.cur - s.begin) != inSize)
res = SZ_ERROR_DATA;
}
Ppmd7_Free(&ppmd, allocMain);
return res;
}
#endif
static SRes SzDecodeLzma(const Byte *props, unsigned propsSize, UInt64 inSize, ILookInStreamPtr inStream,
Byte *outBuffer, SizeT outSize, ISzAllocPtr allocMain)
{
CLzmaDec state;
SRes res = SZ_OK;
LzmaDec_CONSTRUCT(&state)
RINOK(LzmaDec_AllocateProbs(&state, props, propsSize, allocMain))
state.dic = outBuffer;
state.dicBufSize = outSize;
LzmaDec_Init(&state);
for (;;)
{
const void *inBuf = NULL;
size_t lookahead = (1 << 18);
if (lookahead > inSize)
lookahead = (size_t)inSize;
res = ILookInStream_Look(inStream, &inBuf, &lookahead);
if (res != SZ_OK)
break;
{
SizeT inProcessed = (SizeT)lookahead, dicPos = state.dicPos;
ELzmaStatus status;
res = LzmaDec_DecodeToDic(&state, outSize, (const Byte *)inBuf, &inProcessed, LZMA_FINISH_END, &status);
lookahead -= inProcessed;
inSize -= inProcessed;
if (res != SZ_OK)
break;
if (status == LZMA_STATUS_FINISHED_WITH_MARK)
{
if (outSize != state.dicPos || inSize != 0)
res = SZ_ERROR_DATA;
break;
}
if (outSize == state.dicPos && inSize == 0 && status == LZMA_STATUS_MAYBE_FINISHED_WITHOUT_MARK)
break;
if (inProcessed == 0 && dicPos == state.dicPos)
{
res = SZ_ERROR_DATA;
break;
}
res = ILookInStream_Skip(inStream, inProcessed);
if (res != SZ_OK)
break;
}
}
LzmaDec_FreeProbs(&state, allocMain);
return res;
}
#ifndef Z7_NO_METHOD_LZMA2
static SRes SzDecodeLzma2(const Byte *props, unsigned propsSize, UInt64 inSize, ILookInStreamPtr inStream,
Byte *outBuffer, SizeT outSize, ISzAllocPtr allocMain)
{
CLzma2Dec state;
SRes res = SZ_OK;
Lzma2Dec_CONSTRUCT(&state)
if (propsSize != 1)
return SZ_ERROR_DATA;
RINOK(Lzma2Dec_AllocateProbs(&state, props[0], allocMain))
state.decoder.dic = outBuffer;
state.decoder.dicBufSize = outSize;
Lzma2Dec_Init(&state);
for (;;)
{
const void *inBuf = NULL;
size_t lookahead = (1 << 18);
if (lookahead > inSize)
lookahead = (size_t)inSize;
res = ILookInStream_Look(inStream, &inBuf, &lookahead);
if (res != SZ_OK)
break;
{
SizeT inProcessed = (SizeT)lookahead, dicPos = state.decoder.dicPos;
ELzmaStatus status;
res = Lzma2Dec_DecodeToDic(&state, outSize, (const Byte *)inBuf, &inProcessed, LZMA_FINISH_END, &status);
lookahead -= inProcessed;
inSize -= inProcessed;
if (res != SZ_OK)
break;
if (status == LZMA_STATUS_FINISHED_WITH_MARK)
{
if (outSize != state.decoder.dicPos || inSize != 0)
res = SZ_ERROR_DATA;
break;
}
if (inProcessed == 0 && dicPos == state.decoder.dicPos)
{
res = SZ_ERROR_DATA;
break;
}
res = ILookInStream_Skip(inStream, inProcessed);
if (res != SZ_OK)
break;
}
}
Lzma2Dec_FreeProbs(&state, allocMain);
return res;
}
#endif
static SRes SzDecodeCopy(UInt64 inSize, ILookInStreamPtr inStream, Byte *outBuffer)
{
while (inSize > 0)
{
const void *inBuf;
size_t curSize = (1 << 18);
if (curSize > inSize)
curSize = (size_t)inSize;
RINOK(ILookInStream_Look(inStream, &inBuf, &curSize))
if (curSize == 0)
return SZ_ERROR_INPUT_EOF;
memcpy(outBuffer, inBuf, curSize);
outBuffer += curSize;
inSize -= curSize;
RINOK(ILookInStream_Skip(inStream, curSize))
}
return SZ_OK;
}
static BoolInt IS_MAIN_METHOD(UInt32 m)
{
switch (m)
{
case k_Copy:
case k_LZMA:
#ifndef Z7_NO_METHOD_LZMA2
case k_LZMA2:
#endif
#ifdef Z7_PPMD_SUPPORT
case k_PPMD:
#endif
return True;
}
return False;
}
static BoolInt IS_SUPPORTED_CODER(const CSzCoderInfo *c)
{
return
c->NumStreams == 1
/* && c->MethodID <= (UInt32)0xFFFFFFFF */
&& IS_MAIN_METHOD((UInt32)c->MethodID);
}
#define IS_BCJ2(c) ((c)->MethodID == k_BCJ2 && (c)->NumStreams == 4)
static SRes CheckSupportedFolder(const CSzFolder *f)
{
if (f->NumCoders < 1 || f->NumCoders > 4)
return SZ_ERROR_UNSUPPORTED;
if (!IS_SUPPORTED_CODER(&f->Coders[0]))
return SZ_ERROR_UNSUPPORTED;
if (f->NumCoders == 1)
{
if (f->NumPackStreams != 1 || f->PackStreams[0] != 0 || f->NumBonds != 0)
return SZ_ERROR_UNSUPPORTED;
return SZ_OK;
}
#if defined(Z7_USE_BRANCH_FILTER)
if (f->NumCoders == 2)
{
const CSzCoderInfo *c = &f->Coders[1];
if (
/* c->MethodID > (UInt32)0xFFFFFFFF || */
c->NumStreams != 1
|| f->NumPackStreams != 1
|| f->PackStreams[0] != 0
|| f->NumBonds != 1
|| f->Bonds[0].InIndex != 1
|| f->Bonds[0].OutIndex != 0)
return SZ_ERROR_UNSUPPORTED;
switch ((UInt32)c->MethodID)
{
#if !defined(Z7_NO_METHODS_FILTERS)
case k_Delta:
case k_BCJ:
case k_PPC:
case k_IA64:
case k_SPARC:
case k_ARM:
#endif
#ifdef Z7_USE_FILTER_ARM64
case k_ARM64:
#endif
#ifdef Z7_USE_FILTER_ARMT
case k_ARMT:
#endif
break;
default:
return SZ_ERROR_UNSUPPORTED;
}
return SZ_OK;
}
#endif
if (f->NumCoders == 4)
{
if (!IS_SUPPORTED_CODER(&f->Coders[1])
|| !IS_SUPPORTED_CODER(&f->Coders[2])
|| !IS_BCJ2(&f->Coders[3]))
return SZ_ERROR_UNSUPPORTED;
if (f->NumPackStreams != 4
|| f->PackStreams[0] != 2
|| f->PackStreams[1] != 6
|| f->PackStreams[2] != 1
|| f->PackStreams[3] != 0
|| f->NumBonds != 3
|| f->Bonds[0].InIndex != 5 || f->Bonds[0].OutIndex != 0
|| f->Bonds[1].InIndex != 4 || f->Bonds[1].OutIndex != 1
|| f->Bonds[2].InIndex != 3 || f->Bonds[2].OutIndex != 2)
return SZ_ERROR_UNSUPPORTED;
return SZ_OK;
}
return SZ_ERROR_UNSUPPORTED;
}
static SRes SzFolder_Decode2(const CSzFolder *folder,
const Byte *propsData,
const UInt64 *unpackSizes,
const UInt64 *packPositions,
ILookInStreamPtr inStream, UInt64 startPos,
Byte *outBuffer, SizeT outSize, ISzAllocPtr allocMain,
Byte *tempBuf[])
{
UInt32 ci;
SizeT tempSizes[3] = { 0, 0, 0};
SizeT tempSize3 = 0;
Byte *tempBuf3 = 0;
RINOK(CheckSupportedFolder(folder))
for (ci = 0; ci < folder->NumCoders; ci++)
{
const CSzCoderInfo *coder = &folder->Coders[ci];
if (IS_MAIN_METHOD((UInt32)coder->MethodID))
{
UInt32 si = 0;
UInt64 offset;
UInt64 inSize;
Byte *outBufCur = outBuffer;
SizeT outSizeCur = outSize;
if (folder->NumCoders == 4)
{
const UInt32 indices[] = { 3, 2, 0 };
const UInt64 unpackSize = unpackSizes[ci];
si = indices[ci];
if (ci < 2)
{
Byte *temp;
outSizeCur = (SizeT)unpackSize;
if (outSizeCur != unpackSize)
return SZ_ERROR_MEM;
temp = (Byte *)ISzAlloc_Alloc(allocMain, outSizeCur);
if (!temp && outSizeCur != 0)
return SZ_ERROR_MEM;
outBufCur = tempBuf[1 - ci] = temp;
tempSizes[1 - ci] = outSizeCur;
}
else if (ci == 2)
{
if (unpackSize > outSize) /* check it */
return SZ_ERROR_PARAM;
tempBuf3 = outBufCur = outBuffer + (outSize - (size_t)unpackSize);
tempSize3 = outSizeCur = (SizeT)unpackSize;
}
else
return SZ_ERROR_UNSUPPORTED;
}
offset = packPositions[si];
inSize = packPositions[(size_t)si + 1] - offset;
RINOK(LookInStream_SeekTo(inStream, startPos + offset))
if (coder->MethodID == k_Copy)
{
if (inSize != outSizeCur) /* check it */
return SZ_ERROR_DATA;
RINOK(SzDecodeCopy(inSize, inStream, outBufCur))
}
else if (coder->MethodID == k_LZMA)
{
RINOK(SzDecodeLzma(propsData + coder->PropsOffset, coder->PropsSize, inSize, inStream, outBufCur, outSizeCur, allocMain))
}
#ifndef Z7_NO_METHOD_LZMA2
else if (coder->MethodID == k_LZMA2)
{
RINOK(SzDecodeLzma2(propsData + coder->PropsOffset, coder->PropsSize, inSize, inStream, outBufCur, outSizeCur, allocMain))
}
#endif
#ifdef Z7_PPMD_SUPPORT
else if (coder->MethodID == k_PPMD)
{
RINOK(SzDecodePpmd(propsData + coder->PropsOffset, coder->PropsSize, inSize, inStream, outBufCur, outSizeCur, allocMain))
}
#endif
else
return SZ_ERROR_UNSUPPORTED;
}
else if (coder->MethodID == k_BCJ2)
{
const UInt64 offset = packPositions[1];
const UInt64 s3Size = packPositions[2] - offset;
if (ci != 3)
return SZ_ERROR_UNSUPPORTED;
tempSizes[2] = (SizeT)s3Size;
if (tempSizes[2] != s3Size)
return SZ_ERROR_MEM;
tempBuf[2] = (Byte *)ISzAlloc_Alloc(allocMain, tempSizes[2]);
if (!tempBuf[2] && tempSizes[2] != 0)
return SZ_ERROR_MEM;
RINOK(LookInStream_SeekTo(inStream, startPos + offset))
RINOK(SzDecodeCopy(s3Size, inStream, tempBuf[2]))
if ((tempSizes[0] & 3) != 0 ||
(tempSizes[1] & 3) != 0 ||
tempSize3 + tempSizes[0] + tempSizes[1] != outSize)
return SZ_ERROR_DATA;
{
CBcj2Dec p;
p.bufs[0] = tempBuf3; p.lims[0] = tempBuf3 + tempSize3;
p.bufs[1] = tempBuf[0]; p.lims[1] = tempBuf[0] + tempSizes[0];
p.bufs[2] = tempBuf[1]; p.lims[2] = tempBuf[1] + tempSizes[1];
p.bufs[3] = tempBuf[2]; p.lims[3] = tempBuf[2] + tempSizes[2];
p.dest = outBuffer;
p.destLim = outBuffer + outSize;
Bcj2Dec_Init(&p);
RINOK(Bcj2Dec_Decode(&p))
{
unsigned i;
for (i = 0; i < 4; i++)
if (p.bufs[i] != p.lims[i])
return SZ_ERROR_DATA;
if (p.dest != p.destLim || !Bcj2Dec_IsMaybeFinished(&p))
return SZ_ERROR_DATA;
}
}
}
#if defined(Z7_USE_BRANCH_FILTER)
else if (ci == 1)
{
#if !defined(Z7_NO_METHODS_FILTERS)
if (coder->MethodID == k_Delta)
{
if (coder->PropsSize != 1)
return SZ_ERROR_UNSUPPORTED;
{
Byte state[DELTA_STATE_SIZE];
Delta_Init(state);
Delta_Decode(state, (unsigned)(propsData[coder->PropsOffset]) + 1, outBuffer, outSize);
}
continue;
}
#endif
#ifdef Z7_USE_FILTER_ARM64
if (coder->MethodID == k_ARM64)
{
UInt32 pc = 0;
if (coder->PropsSize == 4)
pc = GetUi32(propsData + coder->PropsOffset);
else if (coder->PropsSize != 0)
return SZ_ERROR_UNSUPPORTED;
z7_BranchConv_ARM64_Dec(outBuffer, outSize, pc);
continue;
}
#endif
#if !defined(Z7_NO_METHODS_FILTERS) || defined(Z7_USE_FILTER_ARMT)
{
if (coder->PropsSize != 0)
return SZ_ERROR_UNSUPPORTED;
#define CASE_BRA_CONV(isa) case k_ ## isa: Z7_BRANCH_CONV_DEC(isa)(outBuffer, outSize, 0); break; // pc = 0;
switch (coder->MethodID)
{
#if !defined(Z7_NO_METHODS_FILTERS)
case k_BCJ:
{
UInt32 state = Z7_BRANCH_CONV_ST_X86_STATE_INIT_VAL;
z7_BranchConvSt_X86_Dec(outBuffer, outSize, 0, &state); // pc = 0
break;
}
CASE_BRA_CONV(PPC)
CASE_BRA_CONV(IA64)
CASE_BRA_CONV(SPARC)
CASE_BRA_CONV(ARM)
#endif
#if !defined(Z7_NO_METHODS_FILTERS) || defined(Z7_USE_FILTER_ARMT)
CASE_BRA_CONV(ARMT)
#endif
default:
return SZ_ERROR_UNSUPPORTED;
}
continue;
}
#endif
} // (c == 1)
#endif
else
return SZ_ERROR_UNSUPPORTED;
}
return SZ_OK;
}
SRes SzAr_DecodeFolder(const CSzAr *p, UInt32 folderIndex,
ILookInStreamPtr inStream, UInt64 startPos,
Byte *outBuffer, size_t outSize,
ISzAllocPtr allocMain)
{
SRes res;
CSzFolder folder;
CSzData sd;
const Byte *data = p->CodersData + p->FoCodersOffsets[folderIndex];
sd.Data = data;
sd.Size = p->FoCodersOffsets[(size_t)folderIndex + 1] - p->FoCodersOffsets[folderIndex];
res = SzGetNextFolderItem(&folder, &sd);
if (res != SZ_OK)
return res;
if (sd.Size != 0
|| folder.UnpackStream != p->FoToMainUnpackSizeIndex[folderIndex]
|| outSize != SzAr_GetFolderUnpackSize(p, folderIndex))
return SZ_ERROR_FAIL;
{
unsigned i;
Byte *tempBuf[3] = { 0, 0, 0};
res = SzFolder_Decode2(&folder, data,
&p->CoderUnpackSizes[p->FoToCoderUnpackSizes[folderIndex]],
p->PackPositions + p->FoStartPackStreamIndex[folderIndex],
inStream, startPos,
outBuffer, (SizeT)outSize, allocMain, tempBuf);
for (i = 0; i < 3; i++)
ISzAlloc_Free(allocMain, tempBuf[i]);
if (res == SZ_OK)
if (SzBitWithVals_Check(&p->FolderCRCs, folderIndex))
if (CrcCalc(outBuffer, outSize) != p->FolderCRCs.Vals[folderIndex])
res = SZ_ERROR_CRC;
return res;
}
}

View File

@@ -1,15 +1,27 @@
/* 7zFile.c -- File IO
2008-11-22 : Igor Pavlov : Public domain */
2023-04-02 : Igor Pavlov : Public domain */
#include "Precomp.h"
#include "7zFile.h"
#ifndef USE_WINDOWS_FILE
#include <errno.h>
#include <errno.h>
#endif
#ifndef USE_FOPEN
#include <stdio.h>
#include <fcntl.h>
#ifdef _WIN32
#include <io.h>
typedef int ssize_t;
typedef int off_t;
#else
#include <unistd.h>
#endif
#endif
#ifdef USE_WINDOWS_FILE
#else
/*
ReadFile and WriteFile functions in Windows have BUG:
@@ -21,108 +33,206 @@
And message can be "Network connection was lost"
*/
#define kChunkSizeMax (1 << 22)
#endif
#define kChunkSizeMax (1 << 22)
void File_Construct(CSzFile *p)
{
#ifdef USE_WINDOWS_FILE
p->handle = INVALID_HANDLE_VALUE;
#else
#elif defined(USE_FOPEN)
p->file = NULL;
#else
p->fd = -1;
#endif
}
#if !defined(UNDER_CE) || !defined(USE_WINDOWS_FILE)
static WRes File_Open(CSzFile *p, const char *name, int writeMode)
{
#ifdef USE_WINDOWS_FILE
p->handle = CreateFileA(name,
writeMode ? GENERIC_WRITE : GENERIC_READ,
FILE_SHARE_READ, NULL,
writeMode ? CREATE_ALWAYS : OPEN_EXISTING,
FILE_ATTRIBUTE_NORMAL, NULL);
return (p->handle != INVALID_HANDLE_VALUE) ? 0 : GetLastError();
#else
#elif defined(USE_FOPEN)
p->file = fopen(name, writeMode ? "wb+" : "rb");
return (p->file != 0) ? 0 : errno;
return (p->file != 0) ? 0 :
#ifdef UNDER_CE
2; /* ENOENT */
#else
errno;
#endif
#else
int flags = (writeMode ? (O_CREAT | O_EXCL | O_WRONLY) : O_RDONLY);
#ifdef O_BINARY
flags |= O_BINARY;
#endif
p->fd = open(name, flags, 0666);
return (p->fd != -1) ? 0 : errno;
#endif
}
WRes InFile_Open(CSzFile *p, const char *name) { return File_Open(p, name, 0); }
WRes OutFile_Open(CSzFile *p, const char *name) { return File_Open(p, name, 1); }
WRes OutFile_Open(CSzFile *p, const char *name)
{
#if defined(USE_WINDOWS_FILE) || defined(USE_FOPEN)
return File_Open(p, name, 1);
#else
p->fd = creat(name, 0666);
return (p->fd != -1) ? 0 : errno;
#endif
}
#endif
#ifdef USE_WINDOWS_FILE
static WRes File_OpenW(CSzFile *p, const WCHAR *name, int writeMode)
{
p->handle = CreateFileW(name,
writeMode ? GENERIC_WRITE : GENERIC_READ,
FILE_SHARE_READ, NULL,
writeMode ? CREATE_ALWAYS : OPEN_EXISTING,
FILE_ATTRIBUTE_NORMAL, NULL);
return (p->handle != INVALID_HANDLE_VALUE) ? 0 : GetLastError();
}
WRes InFile_OpenW(CSzFile *p, const WCHAR *name) { return File_OpenW(p, name, 0); }
WRes OutFile_OpenW(CSzFile *p, const WCHAR *name) { return File_OpenW(p, name, 1); }
#endif
WRes File_Close(CSzFile *p)
{
#ifdef USE_WINDOWS_FILE
if (p->handle != INVALID_HANDLE_VALUE)
{
if (!CloseHandle(p->handle))
return GetLastError();
p->handle = INVALID_HANDLE_VALUE;
}
#else
#elif defined(USE_FOPEN)
if (p->file != NULL)
{
int res = fclose(p->file);
if (res != 0)
{
if (res == EOF)
return errno;
return res;
}
p->file = NULL;
}
#else
if (p->fd != -1)
{
if (close(p->fd) != 0)
return errno;
p->fd = -1;
}
#endif
return 0;
}
WRes File_Read(CSzFile *p, void *data, size_t *size)
{
size_t originalSize = *size;
*size = 0;
if (originalSize == 0)
return 0;
#ifdef USE_WINDOWS_FILE
*size = 0;
do
{
DWORD curSize = (originalSize > kChunkSizeMax) ? kChunkSizeMax : (DWORD)originalSize;
const DWORD curSize = (originalSize > kChunkSizeMax) ? kChunkSizeMax : (DWORD)originalSize;
DWORD processed = 0;
BOOL res = ReadFile(p->handle, data, curSize, &processed, NULL);
const BOOL res = ReadFile(p->handle, data, curSize, &processed, NULL);
data = (void *)((Byte *)data + processed);
originalSize -= processed;
*size += processed;
if (!res)
return GetLastError();
// debug : we can break here for partial reading mode
if (processed == 0)
break;
}
while (originalSize > 0);
#elif defined(USE_FOPEN)
do
{
const size_t curSize = (originalSize > kChunkSizeMax) ? kChunkSizeMax : originalSize;
const size_t processed = fread(data, 1, curSize, p->file);
data = (void *)((Byte *)data + (size_t)processed);
originalSize -= processed;
*size += processed;
if (processed != curSize)
return ferror(p->file);
// debug : we can break here for partial reading mode
if (processed == 0)
break;
}
while (originalSize > 0);
return 0;
#else
*size = fread(data, 1, originalSize, p->file);
if (*size == originalSize)
return 0;
return ferror(p->file);
do
{
const size_t curSize = (originalSize > kChunkSizeMax) ? kChunkSizeMax : originalSize;
const ssize_t processed = read(p->fd, data, curSize);
if (processed == -1)
return errno;
if (processed == 0)
break;
data = (void *)((Byte *)data + (size_t)processed);
originalSize -= (size_t)processed;
*size += (size_t)processed;
// debug : we can break here for partial reading mode
// break;
}
while (originalSize > 0);
#endif
return 0;
}
WRes File_Write(CSzFile *p, const void *data, size_t *size)
{
size_t originalSize = *size;
*size = 0;
if (originalSize == 0)
return 0;
#ifdef USE_WINDOWS_FILE
*size = 0;
do
{
DWORD curSize = (originalSize > kChunkSizeMax) ? kChunkSizeMax : (DWORD)originalSize;
const DWORD curSize = (originalSize > kChunkSizeMax) ? kChunkSizeMax : (DWORD)originalSize;
DWORD processed = 0;
BOOL res = WriteFile(p->handle, data, curSize, &processed, NULL);
data = (void *)((Byte *)data + processed);
const BOOL res = WriteFile(p->handle, data, curSize, &processed, NULL);
data = (const void *)((const Byte *)data + processed);
originalSize -= processed;
*size += processed;
if (!res)
@@ -131,61 +241,106 @@ WRes File_Write(CSzFile *p, const void *data, size_t *size)
break;
}
while (originalSize > 0);
return 0;
#elif defined(USE_FOPEN)
do
{
const size_t curSize = (originalSize > kChunkSizeMax) ? kChunkSizeMax : originalSize;
const size_t processed = fwrite(data, 1, curSize, p->file);
data = (void *)((Byte *)data + (size_t)processed);
originalSize -= processed;
*size += processed;
if (processed != curSize)
return ferror(p->file);
if (processed == 0)
break;
}
while (originalSize > 0);
#else
*size = fwrite(data, 1, originalSize, p->file);
if (*size == originalSize)
return 0;
return ferror(p->file);
do
{
const size_t curSize = (originalSize > kChunkSizeMax) ? kChunkSizeMax : originalSize;
const ssize_t processed = write(p->fd, data, curSize);
if (processed == -1)
return errno;
if (processed == 0)
break;
data = (const void *)((const Byte *)data + (size_t)processed);
originalSize -= (size_t)processed;
*size += (size_t)processed;
}
while (originalSize > 0);
#endif
return 0;
}
WRes File_Seek(CSzFile *p, Int64 *pos, ESzSeek origin)
{
#ifdef USE_WINDOWS_FILE
LARGE_INTEGER value;
DWORD moveMethod;
value.LowPart = (DWORD)*pos;
value.HighPart = (LONG)((UInt64)*pos >> 16 >> 16); /* for case when UInt64 is 32-bit only */
switch (origin)
UInt32 low = (UInt32)*pos;
LONG high = (LONG)((UInt64)*pos >> 16 >> 16); /* for case when UInt64 is 32-bit only */
// (int) to eliminate clang warning
switch ((int)origin)
{
case SZ_SEEK_SET: moveMethod = FILE_BEGIN; break;
case SZ_SEEK_CUR: moveMethod = FILE_CURRENT; break;
case SZ_SEEK_END: moveMethod = FILE_END; break;
default: return ERROR_INVALID_PARAMETER;
}
value.LowPart = SetFilePointer(p->handle, value.LowPart, &value.HighPart, moveMethod);
if (value.LowPart == 0xFFFFFFFF)
low = SetFilePointer(p->handle, (LONG)low, &high, moveMethod);
if (low == (UInt32)0xFFFFFFFF)
{
WRes res = GetLastError();
if (res != NO_ERROR)
return res;
}
*pos = ((Int64)value.HighPart << 32) | value.LowPart;
*pos = ((Int64)high << 32) | low;
return 0;
#else
int moveMethod;
int res;
switch (origin)
int moveMethod; // = origin;
switch ((int)origin)
{
case SZ_SEEK_SET: moveMethod = SEEK_SET; break;
case SZ_SEEK_CUR: moveMethod = SEEK_CUR; break;
case SZ_SEEK_END: moveMethod = SEEK_END; break;
default: return 1;
default: return EINVAL;
}
res = fseek(p->file, (long)*pos, moveMethod);
*pos = ftell(p->file);
return res;
#endif
#if defined(USE_FOPEN)
{
int res = fseek(p->file, (long)*pos, moveMethod);
if (res == -1)
return errno;
*pos = ftell(p->file);
if (*pos == -1)
return errno;
return 0;
}
#else
{
off_t res = lseek(p->fd, (off_t)*pos, moveMethod);
if (res == -1)
return errno;
*pos = res;
return 0;
}
#endif // USE_FOPEN
#endif // USE_WINDOWS_FILE
}
WRes File_GetLength(CSzFile *p, UInt64 *length)
{
#ifdef USE_WINDOWS_FILE
@@ -201,13 +356,31 @@ WRes File_GetLength(CSzFile *p, UInt64 *length)
*length = (((UInt64)sizeHigh) << 32) + sizeLow;
return 0;
#else
#elif defined(USE_FOPEN)
long pos = ftell(p->file);
int res = fseek(p->file, 0, SEEK_END);
*length = ftell(p->file);
fseek(p->file, pos, SEEK_SET);
return res;
#else
off_t pos;
*length = 0;
pos = lseek(p->fd, 0, SEEK_CUR);
if (pos != -1)
{
const off_t len2 = lseek(p->fd, 0, SEEK_END);
const off_t res2 = lseek(p->fd, pos, SEEK_SET);
if (len2 != -1)
{
*length = (UInt64)len2;
if (res2 != -1)
return 0;
}
}
return errno;
#endif
}
@@ -215,49 +388,56 @@ WRes File_GetLength(CSzFile *p, UInt64 *length)
/* ---------- FileSeqInStream ---------- */
static SRes FileSeqInStream_Read(void *pp, void *buf, size_t *size)
static SRes FileSeqInStream_Read(ISeqInStreamPtr pp, void *buf, size_t *size)
{
CFileSeqInStream *p = (CFileSeqInStream *)pp;
return File_Read(&p->file, buf, size) == 0 ? SZ_OK : SZ_ERROR_READ;
Z7_CONTAINER_FROM_VTBL_TO_DECL_VAR_pp_vt_p(CFileSeqInStream)
const WRes wres = File_Read(&p->file, buf, size);
p->wres = wres;
return (wres == 0) ? SZ_OK : SZ_ERROR_READ;
}
void FileSeqInStream_CreateVTable(CFileSeqInStream *p)
{
p->s.Read = FileSeqInStream_Read;
p->vt.Read = FileSeqInStream_Read;
}
/* ---------- FileInStream ---------- */
static SRes FileInStream_Read(void *pp, void *buf, size_t *size)
static SRes FileInStream_Read(ISeekInStreamPtr pp, void *buf, size_t *size)
{
CFileInStream *p = (CFileInStream *)pp;
return (File_Read(&p->file, buf, size) == 0) ? SZ_OK : SZ_ERROR_READ;
Z7_CONTAINER_FROM_VTBL_TO_DECL_VAR_pp_vt_p(CFileInStream)
const WRes wres = File_Read(&p->file, buf, size);
p->wres = wres;
return (wres == 0) ? SZ_OK : SZ_ERROR_READ;
}
static SRes FileInStream_Seek(void *pp, Int64 *pos, ESzSeek origin)
static SRes FileInStream_Seek(ISeekInStreamPtr pp, Int64 *pos, ESzSeek origin)
{
CFileInStream *p = (CFileInStream *)pp;
return File_Seek(&p->file, pos, origin);
Z7_CONTAINER_FROM_VTBL_TO_DECL_VAR_pp_vt_p(CFileInStream)
const WRes wres = File_Seek(&p->file, pos, origin);
p->wres = wres;
return (wres == 0) ? SZ_OK : SZ_ERROR_READ;
}
void FileInStream_CreateVTable(CFileInStream *p)
{
p->s.Read = FileInStream_Read;
p->s.Seek = FileInStream_Seek;
p->vt.Read = FileInStream_Read;
p->vt.Seek = FileInStream_Seek;
}
/* ---------- FileOutStream ---------- */
static size_t FileOutStream_Write(void *pp, const void *data, size_t size)
static size_t FileOutStream_Write(ISeqOutStreamPtr pp, const void *data, size_t size)
{
CFileOutStream *p = (CFileOutStream *)pp;
File_Write(&p->file, data, &size);
Z7_CONTAINER_FROM_VTBL_TO_DECL_VAR_pp_vt_p(CFileOutStream)
const WRes wres = File_Write(&p->file, data, &size);
p->wres = wres;
return size;
}
void FileOutStream_CreateVTable(CFileOutStream *p)
{
p->s.Write = FileOutStream_Write;
p->vt.Write = FileOutStream_Write;
}

View File

@@ -1,21 +1,26 @@
/* 7zFile.h -- File IO
2008-11-22 : Igor Pavlov : Public domain */
2023-03-05 : Igor Pavlov : Public domain */
#ifndef __7Z_FILE_H
#define __7Z_FILE_H
#ifndef ZIP7_INC_FILE_H
#define ZIP7_INC_FILE_H
#ifdef _WIN32
#define USE_WINDOWS_FILE
// #include <windows.h>
#endif
#ifdef USE_WINDOWS_FILE
#include <windows.h>
#include "7zWindows.h"
#else
#include <stdio.h>
// note: USE_FOPEN mode is limited to 32-bit file size
// #define USE_FOPEN
// #include <stdio.h>
#endif
#include "Types.h"
#include "7zTypes.h"
EXTERN_C_BEGIN
/* ---------- File ---------- */
@@ -23,14 +28,22 @@ typedef struct
{
#ifdef USE_WINDOWS_FILE
HANDLE handle;
#else
#elif defined(USE_FOPEN)
FILE *file;
#else
int fd;
#endif
} CSzFile;
void File_Construct(CSzFile *p);
#if !defined(UNDER_CE) || !defined(USE_WINDOWS_FILE)
WRes InFile_Open(CSzFile *p, const char *name);
WRes OutFile_Open(CSzFile *p, const char *name);
#endif
#ifdef USE_WINDOWS_FILE
WRes InFile_OpenW(CSzFile *p, const WCHAR *name);
WRes OutFile_OpenW(CSzFile *p, const WCHAR *name);
#endif
WRes File_Close(CSzFile *p);
/* reads max(*size, remain file's size) bytes */
@@ -47,8 +60,9 @@ WRes File_GetLength(CSzFile *p, UInt64 *length);
typedef struct
{
ISeqInStream s;
ISeqInStream vt;
CSzFile file;
WRes wres;
} CFileSeqInStream;
void FileSeqInStream_CreateVTable(CFileSeqInStream *p);
@@ -56,8 +70,9 @@ void FileSeqInStream_CreateVTable(CFileSeqInStream *p);
typedef struct
{
ISeekInStream s;
ISeekInStream vt;
CSzFile file;
WRes wres;
} CFileInStream;
void FileInStream_CreateVTable(CFileInStream *p);
@@ -65,10 +80,13 @@ void FileInStream_CreateVTable(CFileInStream *p);
typedef struct
{
ISeqOutStream s;
ISeqOutStream vt;
CSzFile file;
WRes wres;
} CFileOutStream;
void FileOutStream_CreateVTable(CFileOutStream *p);
EXTERN_C_END
#endif

View File

@@ -1,16 +1,39 @@
/* 7zStream.c -- 7z Stream functions
2008-11-23 : Igor Pavlov : Public domain */
2023-04-02 : Igor Pavlov : Public domain */
#include "Precomp.h"
#include <string.h>
#include "Types.h"
#include "7zTypes.h"
SRes SeqInStream_Read2(ISeqInStream *stream, void *buf, size_t size, SRes errorType)
SRes SeqInStream_ReadMax(ISeqInStreamPtr stream, void *buf, size_t *processedSize)
{
size_t size = *processedSize;
*processedSize = 0;
while (size != 0)
{
size_t cur = size;
const SRes res = ISeqInStream_Read(stream, buf, &cur);
*processedSize += cur;
buf = (void *)((Byte *)buf + cur);
size -= cur;
if (res != SZ_OK)
return res;
if (cur == 0)
return SZ_OK;
}
return SZ_OK;
}
/*
SRes SeqInStream_Read2(ISeqInStreamPtr stream, void *buf, size_t size, SRes errorType)
{
while (size != 0)
{
size_t processed = size;
RINOK(stream->Read(stream, buf, &processed));
RINOK(ISeqInStream_Read(stream, buf, &processed))
if (processed == 0)
return errorType;
buf = (void *)((Byte *)buf + processed);
@@ -19,40 +42,44 @@ SRes SeqInStream_Read2(ISeqInStream *stream, void *buf, size_t size, SRes errorT
return SZ_OK;
}
SRes SeqInStream_Read(ISeqInStream *stream, void *buf, size_t size)
SRes SeqInStream_Read(ISeqInStreamPtr stream, void *buf, size_t size)
{
return SeqInStream_Read2(stream, buf, size, SZ_ERROR_INPUT_EOF);
}
*/
SRes SeqInStream_ReadByte(ISeqInStream *stream, Byte *buf)
SRes SeqInStream_ReadByte(ISeqInStreamPtr stream, Byte *buf)
{
size_t processed = 1;
RINOK(stream->Read(stream, buf, &processed));
RINOK(ISeqInStream_Read(stream, buf, &processed))
return (processed == 1) ? SZ_OK : SZ_ERROR_INPUT_EOF;
}
SRes LookInStream_SeekTo(ILookInStream *stream, UInt64 offset)
SRes LookInStream_SeekTo(ILookInStreamPtr stream, UInt64 offset)
{
Int64 t = offset;
return stream->Seek(stream, &t, SZ_SEEK_SET);
Int64 t = (Int64)offset;
return ILookInStream_Seek(stream, &t, SZ_SEEK_SET);
}
SRes LookInStream_LookRead(ILookInStream *stream, void *buf, size_t *size)
SRes LookInStream_LookRead(ILookInStreamPtr stream, void *buf, size_t *size)
{
void *lookBuf;
const void *lookBuf;
if (*size == 0)
return SZ_OK;
RINOK(stream->Look(stream, &lookBuf, size));
RINOK(ILookInStream_Look(stream, &lookBuf, size))
memcpy(buf, lookBuf, *size);
return stream->Skip(stream, *size);
return ILookInStream_Skip(stream, *size);
}
SRes LookInStream_Read2(ILookInStream *stream, void *buf, size_t size, SRes errorType)
SRes LookInStream_Read2(ILookInStreamPtr stream, void *buf, size_t size, SRes errorType)
{
while (size != 0)
{
size_t processed = size;
RINOK(stream->Read(stream, buf, &processed));
RINOK(ILookInStream_Read(stream, buf, &processed))
if (processed == 0)
return errorType;
buf = (void *)((Byte *)buf + processed);
@@ -61,61 +88,67 @@ SRes LookInStream_Read2(ILookInStream *stream, void *buf, size_t size, SRes erro
return SZ_OK;
}
SRes LookInStream_Read(ILookInStream *stream, void *buf, size_t size)
SRes LookInStream_Read(ILookInStreamPtr stream, void *buf, size_t size)
{
return LookInStream_Read2(stream, buf, size, SZ_ERROR_INPUT_EOF);
}
static SRes LookToRead_Look_Lookahead(void *pp, void **buf, size_t *size)
#define GET_LookToRead2 Z7_CONTAINER_FROM_VTBL_TO_DECL_VAR_pp_vt_p(CLookToRead2)
static SRes LookToRead2_Look_Lookahead(ILookInStreamPtr pp, const void **buf, size_t *size)
{
SRes res = SZ_OK;
CLookToRead *p = (CLookToRead *)pp;
GET_LookToRead2
size_t size2 = p->size - p->pos;
if (size2 == 0 && *size > 0)
if (size2 == 0 && *size != 0)
{
p->pos = 0;
size2 = LookToRead_BUF_SIZE;
res = p->realStream->Read(p->realStream, p->buf, &size2);
p->size = 0;
size2 = p->bufSize;
res = ISeekInStream_Read(p->realStream, p->buf, &size2);
p->size = size2;
}
if (size2 < *size)
if (*size > size2)
*size = size2;
*buf = p->buf + p->pos;
return res;
}
static SRes LookToRead_Look_Exact(void *pp, void **buf, size_t *size)
static SRes LookToRead2_Look_Exact(ILookInStreamPtr pp, const void **buf, size_t *size)
{
SRes res = SZ_OK;
CLookToRead *p = (CLookToRead *)pp;
GET_LookToRead2
size_t size2 = p->size - p->pos;
if (size2 == 0 && *size > 0)
if (size2 == 0 && *size != 0)
{
p->pos = 0;
if (*size > LookToRead_BUF_SIZE)
*size = LookToRead_BUF_SIZE;
res = p->realStream->Read(p->realStream, p->buf, size);
p->size = 0;
if (*size > p->bufSize)
*size = p->bufSize;
res = ISeekInStream_Read(p->realStream, p->buf, size);
size2 = p->size = *size;
}
if (size2 < *size)
if (*size > size2)
*size = size2;
*buf = p->buf + p->pos;
return res;
}
static SRes LookToRead_Skip(void *pp, size_t offset)
static SRes LookToRead2_Skip(ILookInStreamPtr pp, size_t offset)
{
CLookToRead *p = (CLookToRead *)pp;
GET_LookToRead2
p->pos += offset;
return SZ_OK;
}
static SRes LookToRead_Read(void *pp, void *buf, size_t *size)
static SRes LookToRead2_Read(ILookInStreamPtr pp, void *buf, size_t *size)
{
CLookToRead *p = (CLookToRead *)pp;
GET_LookToRead2
size_t rem = p->size - p->pos;
if (rem == 0)
return p->realStream->Read(p->realStream, buf, size);
return ISeekInStream_Read(p->realStream, buf, size);
if (rem > *size)
rem = *size;
memcpy(buf, p->buf + p->pos, rem);
@@ -124,46 +157,43 @@ static SRes LookToRead_Read(void *pp, void *buf, size_t *size)
return SZ_OK;
}
static SRes LookToRead_Seek(void *pp, Int64 *pos, ESzSeek origin)
static SRes LookToRead2_Seek(ILookInStreamPtr pp, Int64 *pos, ESzSeek origin)
{
CLookToRead *p = (CLookToRead *)pp;
GET_LookToRead2
p->pos = p->size = 0;
return p->realStream->Seek(p->realStream, pos, origin);
return ISeekInStream_Seek(p->realStream, pos, origin);
}
void LookToRead_CreateVTable(CLookToRead *p, int lookahead)
void LookToRead2_CreateVTable(CLookToRead2 *p, int lookahead)
{
p->s.Look = lookahead ?
LookToRead_Look_Lookahead :
LookToRead_Look_Exact;
p->s.Skip = LookToRead_Skip;
p->s.Read = LookToRead_Read;
p->s.Seek = LookToRead_Seek;
p->vt.Look = lookahead ?
LookToRead2_Look_Lookahead :
LookToRead2_Look_Exact;
p->vt.Skip = LookToRead2_Skip;
p->vt.Read = LookToRead2_Read;
p->vt.Seek = LookToRead2_Seek;
}
void LookToRead_Init(CLookToRead *p)
{
p->pos = p->size = 0;
}
static SRes SecToLook_Read(void *pp, void *buf, size_t *size)
static SRes SecToLook_Read(ISeqInStreamPtr pp, void *buf, size_t *size)
{
CSecToLook *p = (CSecToLook *)pp;
Z7_CONTAINER_FROM_VTBL_TO_DECL_VAR_pp_vt_p(CSecToLook)
return LookInStream_LookRead(p->realStream, buf, size);
}
void SecToLook_CreateVTable(CSecToLook *p)
{
p->s.Read = SecToLook_Read;
p->vt.Read = SecToLook_Read;
}
static SRes SecToRead_Read(void *pp, void *buf, size_t *size)
static SRes SecToRead_Read(ISeqInStreamPtr pp, void *buf, size_t *size)
{
CSecToRead *p = (CSecToRead *)pp;
return p->realStream->Read(p->realStream, buf, size);
Z7_CONTAINER_FROM_VTBL_TO_DECL_VAR_pp_vt_p(CSecToRead)
return ILookInStream_Read(p->realStream, buf, size);
}
void SecToRead_CreateVTable(CSecToRead *p)
{
p->s.Read = SecToRead_Read;
p->vt.Read = SecToRead_Read;
}

597
C/7zTypes.h Executable file
View File

@@ -0,0 +1,597 @@
/* 7zTypes.h -- Basic types
2023-04-02 : Igor Pavlov : Public domain */
#ifndef ZIP7_7Z_TYPES_H
#define ZIP7_7Z_TYPES_H
#ifdef _WIN32
/* #include <windows.h> */
#else
#include <errno.h>
#endif
#include <stddef.h>
#ifndef EXTERN_C_BEGIN
#ifdef __cplusplus
#define EXTERN_C_BEGIN extern "C" {
#define EXTERN_C_END }
#else
#define EXTERN_C_BEGIN
#define EXTERN_C_END
#endif
#endif
EXTERN_C_BEGIN
#define SZ_OK 0
#define SZ_ERROR_DATA 1
#define SZ_ERROR_MEM 2
#define SZ_ERROR_CRC 3
#define SZ_ERROR_UNSUPPORTED 4
#define SZ_ERROR_PARAM 5
#define SZ_ERROR_INPUT_EOF 6
#define SZ_ERROR_OUTPUT_EOF 7
#define SZ_ERROR_READ 8
#define SZ_ERROR_WRITE 9
#define SZ_ERROR_PROGRESS 10
#define SZ_ERROR_FAIL 11
#define SZ_ERROR_THREAD 12
#define SZ_ERROR_ARCHIVE 16
#define SZ_ERROR_NO_ARCHIVE 17
typedef int SRes;
#ifdef _MSC_VER
#if _MSC_VER > 1200
#define MY_ALIGN(n) __declspec(align(n))
#else
#define MY_ALIGN(n)
#endif
#else
/*
// C11/C++11:
#include <stdalign.h>
#define MY_ALIGN(n) alignas(n)
*/
#define MY_ALIGN(n) __attribute__ ((aligned(n)))
#endif
#ifdef _WIN32
/* typedef DWORD WRes; */
typedef unsigned WRes;
#define MY_SRes_HRESULT_FROM_WRes(x) HRESULT_FROM_WIN32(x)
// #define MY_HRES_ERROR_INTERNAL_ERROR MY_SRes_HRESULT_FROM_WRes(ERROR_INTERNAL_ERROR)
#else // _WIN32
// #define ENV_HAVE_LSTAT
typedef int WRes;
// (FACILITY_ERRNO = 0x800) is 7zip's FACILITY constant to represent (errno) errors in HRESULT
#define MY_FACILITY_ERRNO 0x800
#define MY_FACILITY_WIN32 7
#define MY_FACILITY_WRes MY_FACILITY_ERRNO
#define MY_HRESULT_FROM_errno_CONST_ERROR(x) ((HRESULT)( \
( (HRESULT)(x) & 0x0000FFFF) \
| (MY_FACILITY_WRes << 16) \
| (HRESULT)0x80000000 ))
#define MY_SRes_HRESULT_FROM_WRes(x) \
((HRESULT)(x) <= 0 ? ((HRESULT)(x)) : MY_HRESULT_FROM_errno_CONST_ERROR(x))
// we call macro HRESULT_FROM_WIN32 for system errors (WRes) that are (errno)
#define HRESULT_FROM_WIN32(x) MY_SRes_HRESULT_FROM_WRes(x)
/*
#define ERROR_FILE_NOT_FOUND 2L
#define ERROR_ACCESS_DENIED 5L
#define ERROR_NO_MORE_FILES 18L
#define ERROR_LOCK_VIOLATION 33L
#define ERROR_FILE_EXISTS 80L
#define ERROR_DISK_FULL 112L
#define ERROR_NEGATIVE_SEEK 131L
#define ERROR_ALREADY_EXISTS 183L
#define ERROR_DIRECTORY 267L
#define ERROR_TOO_MANY_POSTS 298L
#define ERROR_INTERNAL_ERROR 1359L
#define ERROR_INVALID_REPARSE_DATA 4392L
#define ERROR_REPARSE_TAG_INVALID 4393L
#define ERROR_REPARSE_TAG_MISMATCH 4394L
*/
// we use errno equivalents for some WIN32 errors:
#define ERROR_INVALID_PARAMETER EINVAL
#define ERROR_INVALID_FUNCTION EINVAL
#define ERROR_ALREADY_EXISTS EEXIST
#define ERROR_FILE_EXISTS EEXIST
#define ERROR_PATH_NOT_FOUND ENOENT
#define ERROR_FILE_NOT_FOUND ENOENT
#define ERROR_DISK_FULL ENOSPC
// #define ERROR_INVALID_HANDLE EBADF
// we use FACILITY_WIN32 for errors that has no errno equivalent
// Too many posts were made to a semaphore.
#define ERROR_TOO_MANY_POSTS ((HRESULT)0x8007012AL)
#define ERROR_INVALID_REPARSE_DATA ((HRESULT)0x80071128L)
#define ERROR_REPARSE_TAG_INVALID ((HRESULT)0x80071129L)
// if (MY_FACILITY_WRes != FACILITY_WIN32),
// we use FACILITY_WIN32 for COM errors:
#define E_OUTOFMEMORY ((HRESULT)0x8007000EL)
#define E_INVALIDARG ((HRESULT)0x80070057L)
#define MY_E_ERROR_NEGATIVE_SEEK ((HRESULT)0x80070083L)
/*
// we can use FACILITY_ERRNO for some COM errors, that have errno equivalents:
#define E_OUTOFMEMORY MY_HRESULT_FROM_errno_CONST_ERROR(ENOMEM)
#define E_INVALIDARG MY_HRESULT_FROM_errno_CONST_ERROR(EINVAL)
#define MY_E_ERROR_NEGATIVE_SEEK MY_HRESULT_FROM_errno_CONST_ERROR(EINVAL)
*/
#define TEXT(quote) quote
#define FILE_ATTRIBUTE_READONLY 0x0001
#define FILE_ATTRIBUTE_HIDDEN 0x0002
#define FILE_ATTRIBUTE_SYSTEM 0x0004
#define FILE_ATTRIBUTE_DIRECTORY 0x0010
#define FILE_ATTRIBUTE_ARCHIVE 0x0020
#define FILE_ATTRIBUTE_DEVICE 0x0040
#define FILE_ATTRIBUTE_NORMAL 0x0080
#define FILE_ATTRIBUTE_TEMPORARY 0x0100
#define FILE_ATTRIBUTE_SPARSE_FILE 0x0200
#define FILE_ATTRIBUTE_REPARSE_POINT 0x0400
#define FILE_ATTRIBUTE_COMPRESSED 0x0800
#define FILE_ATTRIBUTE_OFFLINE 0x1000
#define FILE_ATTRIBUTE_NOT_CONTENT_INDEXED 0x2000
#define FILE_ATTRIBUTE_ENCRYPTED 0x4000
#define FILE_ATTRIBUTE_UNIX_EXTENSION 0x8000 /* trick for Unix */
#endif
#ifndef RINOK
#define RINOK(x) { const int _result_ = (x); if (_result_ != 0) return _result_; }
#endif
#ifndef RINOK_WRes
#define RINOK_WRes(x) { const WRes _result_ = (x); if (_result_ != 0) return _result_; }
#endif
typedef unsigned char Byte;
typedef short Int16;
typedef unsigned short UInt16;
#ifdef Z7_DECL_Int32_AS_long
typedef long Int32;
typedef unsigned long UInt32;
#else
typedef int Int32;
typedef unsigned int UInt32;
#endif
#ifndef _WIN32
typedef int INT;
typedef Int32 INT32;
typedef unsigned int UINT;
typedef UInt32 UINT32;
typedef INT32 LONG; // LONG, ULONG and DWORD must be 32-bit for _WIN32 compatibility
typedef UINT32 ULONG;
#undef DWORD
typedef UINT32 DWORD;
#define VOID void
#define HRESULT LONG
typedef void *LPVOID;
// typedef void VOID;
// typedef ULONG_PTR DWORD_PTR, *PDWORD_PTR;
// gcc / clang on Unix : sizeof(long==sizeof(void*) in 32 or 64 bits)
typedef long INT_PTR;
typedef unsigned long UINT_PTR;
typedef long LONG_PTR;
typedef unsigned long DWORD_PTR;
typedef size_t SIZE_T;
#endif // _WIN32
#define MY_HRES_ERROR_INTERNAL_ERROR ((HRESULT)0x8007054FL)
#ifdef Z7_DECL_Int64_AS_long
typedef long Int64;
typedef unsigned long UInt64;
#else
#if (defined(_MSC_VER) || defined(__BORLANDC__)) && !defined(__clang__)
typedef __int64 Int64;
typedef unsigned __int64 UInt64;
#else
#if defined(__clang__) || defined(__GNUC__)
#include <stdint.h>
typedef int64_t Int64;
typedef uint64_t UInt64;
#else
typedef long long int Int64;
typedef unsigned long long int UInt64;
// #define UINT64_CONST(n) n ## ULL
#endif
#endif
#endif
#define UINT64_CONST(n) n
#ifdef Z7_DECL_SizeT_AS_unsigned_int
typedef unsigned int SizeT;
#else
typedef size_t SizeT;
#endif
/*
#if (defined(_MSC_VER) && _MSC_VER <= 1200)
typedef size_t MY_uintptr_t;
#else
#include <stdint.h>
typedef uintptr_t MY_uintptr_t;
#endif
*/
typedef int BoolInt;
/* typedef BoolInt Bool; */
#define True 1
#define False 0
#ifdef _WIN32
#define Z7_STDCALL __stdcall
#else
#define Z7_STDCALL
#endif
#ifdef _MSC_VER
#if _MSC_VER >= 1300
#define Z7_NO_INLINE __declspec(noinline)
#else
#define Z7_NO_INLINE
#endif
#define Z7_FORCE_INLINE __forceinline
#define Z7_CDECL __cdecl
#define Z7_FASTCALL __fastcall
#else // _MSC_VER
#if (defined(__GNUC__) && (__GNUC__ >= 4)) \
|| (defined(__clang__) && (__clang_major__ >= 4)) \
|| defined(__INTEL_COMPILER) \
|| defined(__xlC__)
#define Z7_NO_INLINE __attribute__((noinline))
#define Z7_FORCE_INLINE __attribute__((always_inline)) inline
#else
#define Z7_NO_INLINE
#define Z7_FORCE_INLINE
#endif
#define Z7_CDECL
#if defined(_M_IX86) \
|| defined(__i386__)
// #define Z7_FASTCALL __attribute__((fastcall))
// #define Z7_FASTCALL __attribute__((cdecl))
#define Z7_FASTCALL
#elif defined(MY_CPU_AMD64)
// #define Z7_FASTCALL __attribute__((ms_abi))
#define Z7_FASTCALL
#else
#define Z7_FASTCALL
#endif
#endif // _MSC_VER
/* The following interfaces use first parameter as pointer to structure */
// #define Z7_C_IFACE_CONST_QUAL
#define Z7_C_IFACE_CONST_QUAL const
#define Z7_C_IFACE_DECL(a) \
struct a ## _; \
typedef Z7_C_IFACE_CONST_QUAL struct a ## _ * a ## Ptr; \
typedef struct a ## _ a; \
struct a ## _
Z7_C_IFACE_DECL (IByteIn)
{
Byte (*Read)(IByteInPtr p); /* reads one byte, returns 0 in case of EOF or error */
};
#define IByteIn_Read(p) (p)->Read(p)
Z7_C_IFACE_DECL (IByteOut)
{
void (*Write)(IByteOutPtr p, Byte b);
};
#define IByteOut_Write(p, b) (p)->Write(p, b)
Z7_C_IFACE_DECL (ISeqInStream)
{
SRes (*Read)(ISeqInStreamPtr p, void *buf, size_t *size);
/* if (input(*size) != 0 && output(*size) == 0) means end_of_stream.
(output(*size) < input(*size)) is allowed */
};
#define ISeqInStream_Read(p, buf, size) (p)->Read(p, buf, size)
/* try to read as much as avail in stream and limited by (*processedSize) */
SRes SeqInStream_ReadMax(ISeqInStreamPtr stream, void *buf, size_t *processedSize);
/* it can return SZ_ERROR_INPUT_EOF */
// SRes SeqInStream_Read(ISeqInStreamPtr stream, void *buf, size_t size);
// SRes SeqInStream_Read2(ISeqInStreamPtr stream, void *buf, size_t size, SRes errorType);
SRes SeqInStream_ReadByte(ISeqInStreamPtr stream, Byte *buf);
Z7_C_IFACE_DECL (ISeqOutStream)
{
size_t (*Write)(ISeqOutStreamPtr p, const void *buf, size_t size);
/* Returns: result - the number of actually written bytes.
(result < size) means error */
};
#define ISeqOutStream_Write(p, buf, size) (p)->Write(p, buf, size)
typedef enum
{
SZ_SEEK_SET = 0,
SZ_SEEK_CUR = 1,
SZ_SEEK_END = 2
} ESzSeek;
Z7_C_IFACE_DECL (ISeekInStream)
{
SRes (*Read)(ISeekInStreamPtr p, void *buf, size_t *size); /* same as ISeqInStream::Read */
SRes (*Seek)(ISeekInStreamPtr p, Int64 *pos, ESzSeek origin);
};
#define ISeekInStream_Read(p, buf, size) (p)->Read(p, buf, size)
#define ISeekInStream_Seek(p, pos, origin) (p)->Seek(p, pos, origin)
Z7_C_IFACE_DECL (ILookInStream)
{
SRes (*Look)(ILookInStreamPtr p, const void **buf, size_t *size);
/* if (input(*size) != 0 && output(*size) == 0) means end_of_stream.
(output(*size) > input(*size)) is not allowed
(output(*size) < input(*size)) is allowed */
SRes (*Skip)(ILookInStreamPtr p, size_t offset);
/* offset must be <= output(*size) of Look */
SRes (*Read)(ILookInStreamPtr p, void *buf, size_t *size);
/* reads directly (without buffer). It's same as ISeqInStream::Read */
SRes (*Seek)(ILookInStreamPtr p, Int64 *pos, ESzSeek origin);
};
#define ILookInStream_Look(p, buf, size) (p)->Look(p, buf, size)
#define ILookInStream_Skip(p, offset) (p)->Skip(p, offset)
#define ILookInStream_Read(p, buf, size) (p)->Read(p, buf, size)
#define ILookInStream_Seek(p, pos, origin) (p)->Seek(p, pos, origin)
SRes LookInStream_LookRead(ILookInStreamPtr stream, void *buf, size_t *size);
SRes LookInStream_SeekTo(ILookInStreamPtr stream, UInt64 offset);
/* reads via ILookInStream::Read */
SRes LookInStream_Read2(ILookInStreamPtr stream, void *buf, size_t size, SRes errorType);
SRes LookInStream_Read(ILookInStreamPtr stream, void *buf, size_t size);
typedef struct
{
ILookInStream vt;
ISeekInStreamPtr realStream;
size_t pos;
size_t size; /* it's data size */
/* the following variables must be set outside */
Byte *buf;
size_t bufSize;
} CLookToRead2;
void LookToRead2_CreateVTable(CLookToRead2 *p, int lookahead);
#define LookToRead2_INIT(p) { (p)->pos = (p)->size = 0; }
typedef struct
{
ISeqInStream vt;
ILookInStreamPtr realStream;
} CSecToLook;
void SecToLook_CreateVTable(CSecToLook *p);
typedef struct
{
ISeqInStream vt;
ILookInStreamPtr realStream;
} CSecToRead;
void SecToRead_CreateVTable(CSecToRead *p);
Z7_C_IFACE_DECL (ICompressProgress)
{
SRes (*Progress)(ICompressProgressPtr p, UInt64 inSize, UInt64 outSize);
/* Returns: result. (result != SZ_OK) means break.
Value (UInt64)(Int64)-1 for size means unknown value. */
};
#define ICompressProgress_Progress(p, inSize, outSize) (p)->Progress(p, inSize, outSize)
typedef struct ISzAlloc ISzAlloc;
typedef const ISzAlloc * ISzAllocPtr;
struct ISzAlloc
{
void *(*Alloc)(ISzAllocPtr p, size_t size);
void (*Free)(ISzAllocPtr p, void *address); /* address can be 0 */
};
#define ISzAlloc_Alloc(p, size) (p)->Alloc(p, size)
#define ISzAlloc_Free(p, a) (p)->Free(p, a)
/* deprecated */
#define IAlloc_Alloc(p, size) ISzAlloc_Alloc(p, size)
#define IAlloc_Free(p, a) ISzAlloc_Free(p, a)
#ifndef MY_offsetof
#ifdef offsetof
#define MY_offsetof(type, m) offsetof(type, m)
/*
#define MY_offsetof(type, m) FIELD_OFFSET(type, m)
*/
#else
#define MY_offsetof(type, m) ((size_t)&(((type *)0)->m))
#endif
#endif
#ifndef Z7_container_of
/*
#define Z7_container_of(ptr, type, m) container_of(ptr, type, m)
#define Z7_container_of(ptr, type, m) CONTAINING_RECORD(ptr, type, m)
#define Z7_container_of(ptr, type, m) ((type *)((char *)(ptr) - offsetof(type, m)))
#define Z7_container_of(ptr, type, m) (&((type *)0)->m == (ptr), ((type *)(((char *)(ptr)) - MY_offsetof(type, m))))
*/
/*
GCC shows warning: "perhaps the 'offsetof' macro was used incorrectly"
GCC 3.4.4 : classes with constructor
GCC 4.8.1 : classes with non-public variable members"
*/
#define Z7_container_of(ptr, type, m) \
((type *)(void *)((char *)(void *) \
(1 ? (ptr) : &((type *)NULL)->m) - MY_offsetof(type, m)))
#define Z7_container_of_CONST(ptr, type, m) \
((const type *)(const void *)((const char *)(const void *) \
(1 ? (ptr) : &((type *)NULL)->m) - MY_offsetof(type, m)))
/*
#define Z7_container_of_NON_CONST_FROM_CONST(ptr, type, m) \
((type *)(void *)(const void *)((const char *)(const void *) \
(1 ? (ptr) : &((type *)NULL)->m) - MY_offsetof(type, m)))
*/
#endif
#define Z7_CONTAINER_FROM_VTBL_SIMPLE(ptr, type, m) ((type *)(void *)(ptr))
// #define Z7_CONTAINER_FROM_VTBL(ptr, type, m) Z7_CONTAINER_FROM_VTBL_SIMPLE(ptr, type, m)
#define Z7_CONTAINER_FROM_VTBL(ptr, type, m) Z7_container_of(ptr, type, m)
// #define Z7_CONTAINER_FROM_VTBL(ptr, type, m) Z7_container_of_NON_CONST_FROM_CONST(ptr, type, m)
#define Z7_CONTAINER_FROM_VTBL_CONST(ptr, type, m) Z7_container_of_CONST(ptr, type, m)
#define Z7_CONTAINER_FROM_VTBL_CLS(ptr, type, m) Z7_CONTAINER_FROM_VTBL_SIMPLE(ptr, type, m)
/*
#define Z7_CONTAINER_FROM_VTBL_CLS(ptr, type, m) Z7_CONTAINER_FROM_VTBL(ptr, type, m)
*/
#if defined (__clang__) || defined(__GNUC__)
#define Z7_DIAGNOSCTIC_IGNORE_BEGIN_CAST_QUAL \
_Pragma("GCC diagnostic push") \
_Pragma("GCC diagnostic ignored \"-Wcast-qual\"")
#define Z7_DIAGNOSCTIC_IGNORE_END_CAST_QUAL \
_Pragma("GCC diagnostic pop")
#else
#define Z7_DIAGNOSCTIC_IGNORE_BEGIN_CAST_QUAL
#define Z7_DIAGNOSCTIC_IGNORE_END_CAST_QUAL
#endif
#define Z7_CONTAINER_FROM_VTBL_TO_DECL_VAR(ptr, type, m, p) \
Z7_DIAGNOSCTIC_IGNORE_BEGIN_CAST_QUAL \
type *p = Z7_CONTAINER_FROM_VTBL(ptr, type, m); \
Z7_DIAGNOSCTIC_IGNORE_END_CAST_QUAL
#define Z7_CONTAINER_FROM_VTBL_TO_DECL_VAR_pp_vt_p(type) \
Z7_CONTAINER_FROM_VTBL_TO_DECL_VAR(pp, type, vt, p)
// #define ZIP7_DECLARE_HANDLE(name) typedef void *name;
#define Z7_DECLARE_HANDLE(name) struct name##_dummy{int unused;}; typedef struct name##_dummy *name;
#define Z7_memset_0_ARRAY(a) memset((a), 0, sizeof(a))
#ifndef Z7_ARRAY_SIZE
#define Z7_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))
#endif
#ifdef _WIN32
#define CHAR_PATH_SEPARATOR '\\'
#define WCHAR_PATH_SEPARATOR L'\\'
#define STRING_PATH_SEPARATOR "\\"
#define WSTRING_PATH_SEPARATOR L"\\"
#else
#define CHAR_PATH_SEPARATOR '/'
#define WCHAR_PATH_SEPARATOR L'/'
#define STRING_PATH_SEPARATOR "/"
#define WSTRING_PATH_SEPARATOR L"/"
#endif
#define k_PropVar_TimePrec_0 0
#define k_PropVar_TimePrec_Unix 1
#define k_PropVar_TimePrec_DOS 2
#define k_PropVar_TimePrec_HighPrec 3
#define k_PropVar_TimePrec_Base 16
#define k_PropVar_TimePrec_100ns (k_PropVar_TimePrec_Base + 7)
#define k_PropVar_TimePrec_1ns (k_PropVar_TimePrec_Base + 9)
EXTERN_C_END
#endif
/*
#ifndef Z7_ST
#ifdef _7ZIP_ST
#define Z7_ST
#endif
#endif
*/

View File

@@ -1,7 +1,42 @@
#define MY_VER_MAJOR 4
#define MY_VER_MINOR 64
#define MY_VER_MAJOR 23
#define MY_VER_MINOR 01
#define MY_VER_BUILD 0
#define MY_VERSION "4.64"
#define MY_DATE "2009-01-03"
#define MY_COPYRIGHT ": Igor Pavlov : Public domain"
#define MY_VERSION_COPYRIGHT_DATE MY_VERSION " " MY_COPYRIGHT " : " MY_DATE
#define MY_VERSION_NUMBERS "23.01"
#define MY_VERSION MY_VERSION_NUMBERS
#ifdef MY_CPU_NAME
#define MY_VERSION_CPU MY_VERSION " (" MY_CPU_NAME ")"
#else
#define MY_VERSION_CPU MY_VERSION
#endif
#define MY_DATE "2023-06-20"
#undef MY_COPYRIGHT
#undef MY_VERSION_COPYRIGHT_DATE
#define MY_AUTHOR_NAME "Igor Pavlov"
#define MY_COPYRIGHT_PD "Igor Pavlov : Public domain"
#define MY_COPYRIGHT_CR "Copyright (c) 1999-2023 Igor Pavlov"
#ifdef USE_COPYRIGHT_CR
#define MY_COPYRIGHT MY_COPYRIGHT_CR
#else
#define MY_COPYRIGHT MY_COPYRIGHT_PD
#endif
#define MY_COPYRIGHT_DATE MY_COPYRIGHT " : " MY_DATE
#define MY_VERSION_COPYRIGHT_DATE MY_VERSION_CPU " : " MY_COPYRIGHT " : " MY_DATE
#define MY_EASY7ZIP_VER_MAJOR 0
#define MY_EASY7ZIP_VER_MINOR 1
#define MY_EASY7ZIP_7ZIP "Easy 7-Zip"
#define MY_EASY7ZIP_VERSION "0.1.6-shunf4-2"
#define MY_EASY7ZIP_7ZIP_VERSION "Easy 7-Zip v0.1.6-shunf4-2"
#define MY_EASY7ZIP_COPYRIGHT "Portions Copyright (C) 2013-2016 James Hoo"
#define MY_EASY7ZIP_AUTHOR "James Hoo"
#define MY_EASY7ZIP_HOMEPAGE "e7z.org"
#define MY_EASY7ZIP_SPECIAL_BUILD MY_EASY7ZIP_7ZIP_VERSION " (www." MY_EASY7ZIP_HOMEPAGE ") made by " MY_EASY7ZIP_AUTHOR

55
C/7zVersion.rc Executable file
View File

@@ -0,0 +1,55 @@
#define MY_VS_FFI_FILEFLAGSMASK 0x0000003FL
#define MY_VOS_NT_WINDOWS32 0x00040004L
#define MY_VOS_CE_WINDOWS32 0x00050004L
#define MY_VFT_APP 0x00000001L
#define MY_VFT_DLL 0x00000002L
// #include <WinVer.h>
#ifndef MY_VERSION
#include "7zVersion.h"
#endif
#define MY_VER MY_VER_MAJOR,MY_VER_MINOR,MY_VER_BUILD,0
#ifdef DEBUG
#define DBG_FL VS_FF_DEBUG
#else
#define DBG_FL 0
#endif
#define MY_VERSION_INFO(fileType, descr, intName, origName) \
LANGUAGE 9, 1 \
1 VERSIONINFO \
FILEVERSION MY_VER \
PRODUCTVERSION MY_VER \
FILEFLAGSMASK MY_VS_FFI_FILEFLAGSMASK \
FILEFLAGS DBG_FL \
FILEOS MY_VOS_NT_WINDOWS32 \
FILETYPE fileType \
FILESUBTYPE 0x0L \
BEGIN \
BLOCK "StringFileInfo" \
BEGIN \
BLOCK "040904b0" \
BEGIN \
VALUE "CompanyName", "Igor Pavlov" \
VALUE "FileDescription", descr \
VALUE "FileVersion", MY_VERSION \
VALUE "InternalName", intName \
VALUE "LegalCopyright", MY_COPYRIGHT \
VALUE "OriginalFilename", origName \
VALUE "ProductName", "7-Zip" \
VALUE "ProductVersion", MY_VERSION \
END \
END \
BLOCK "VarFileInfo" \
BEGIN \
VALUE "Translation", 0x409, 1200 \
END \
END
#define MY_VERSION_INFO_APP(descr, intName) MY_VERSION_INFO(MY_VFT_APP, descr, intName, intName ".exe")
#define MY_VERSION_INFO_DLL(descr, intName) MY_VERSION_INFO(MY_VFT_DLL, descr, intName, intName ".dll")

103
C/7zWindows.h Executable file
View File

@@ -0,0 +1,103 @@
/* 7zWindows.h -- StdAfx
2023-04-02 : Igor Pavlov : Public domain */
#ifndef ZIP7_INC_7Z_WINDOWS_H
#define ZIP7_INC_7Z_WINDOWS_H
#ifdef _WIN32
#if defined(__clang__)
# pragma clang diagnostic push
#endif
#pragma warning(disable : 4255)
#if defined(_MSC_VER)
#pragma warning(push)
#pragma warning(disable : 4668) // '_WIN32_WINNT' is not defined as a preprocessor macro, replacing with '0' for '#if/#elif'
#if _MSC_VER == 1900
// for old kit10 versions
#pragma warning(disable : 4255) // winuser.h(13979): warning C4255: 'GetThreadDpiAwarenessContext':
#endif
// win10 Windows Kit:
#endif // _MSC_VER
#if defined(_MSC_VER) && _MSC_VER <= 1200 && !defined(_WIN64)
// for msvc6 without sdk2003
#define RPC_NO_WINDOWS_H
#endif
#if defined(__MINGW32__) || defined(__MINGW64__)
// #if defined(__GNUC__) && !defined(__clang__)
#include <windows.h>
#else
#include <Windows.h>
#endif
// #include <basetsd.h>
// #include <wtypes.h>
// but if precompiled with clang-cl then we need
// #include <windows.h>
#if defined(_MSC_VER)
#pragma warning(pop)
#endif
#if defined(__clang__)
# pragma clang diagnostic pop
#endif
#if defined(_MSC_VER) && _MSC_VER <= 1200 && !defined(_WIN64)
#ifndef _W64
typedef long LONG_PTR, *PLONG_PTR;
typedef unsigned long ULONG_PTR, *PULONG_PTR;
typedef ULONG_PTR DWORD_PTR, *PDWORD_PTR;
#define Z7_OLD_WIN_SDK
#endif // _W64
#endif // _MSC_VER == 1200
#ifdef Z7_OLD_WIN_SDK
#ifndef INVALID_FILE_ATTRIBUTES
#define INVALID_FILE_ATTRIBUTES ((DWORD)-1)
#endif
#ifndef INVALID_SET_FILE_POINTER
#define INVALID_SET_FILE_POINTER ((DWORD)-1)
#endif
#ifndef FILE_SPECIAL_ACCESS
#define FILE_SPECIAL_ACCESS (FILE_ANY_ACCESS)
#endif
// ShlObj.h:
// #define BIF_NEWDIALOGSTYLE 0x0040
#pragma warning(disable : 4201)
// #pragma warning(disable : 4115)
#undef VARIANT_TRUE
#define VARIANT_TRUE ((VARIANT_BOOL)-1)
#endif
#endif // Z7_OLD_WIN_SDK
#ifdef UNDER_CE
#undef VARIANT_TRUE
#define VARIANT_TRUE ((VARIANT_BOOL)-1)
#endif
#if defined(_MSC_VER)
#if _MSC_VER >= 1400 && _MSC_VER <= 1600
// BaseTsd.h(148) : 'HandleToULong' : unreferenced inline function has been removed
// string.h
// #pragma warning(disable : 4514)
#endif
#endif
/* #include "7zTypes.h" */
#endif

360
C/7zip_gcc_c.mak Executable file
View File

@@ -0,0 +1,360 @@
MY_ARCH_2 = $(MY_ARCH)
MY_ASM = jwasm
MY_ASM = asmc
ifndef RC
#RC=windres.exe --target=pe-x86-64
#RC=windres.exe -F pe-i386
RC=windres.exe
endif
PROGPATH = $(O)/$(PROG)
PROGPATH_STATIC = $(O)/$(PROG)s
ifneq ($(CC), xlc)
CFLAGS_WARN_WALL = -Wall -Werror -Wextra
endif
# for object file
CFLAGS_BASE_LIST = -c
# for ASM file
# CFLAGS_BASE_LIST = -S
FLAGS_FLTO =
FLAGS_FLTO = -flto
CFLAGS_BASE = $(MY_ARCH_2) -O2 $(CFLAGS_BASE_LIST) $(CFLAGS_WARN_WALL) $(CFLAGS_WARN) \
-DNDEBUG -D_REENTRANT -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE
ifdef SystemDrive
IS_MINGW = 1
else
ifdef SYSTEMDRIVE
# ifdef OS
IS_MINGW = 1
endif
endif
ifdef IS_MINGW
LDFLAGS_STATIC_2 = -static
else
ifndef DEF_FILE
ifndef IS_NOT_STANDALONE
ifndef MY_DYNAMIC_LINK
ifneq ($(CC), clang)
LDFLAGS_STATIC_2 =
# -static
# -static-libstdc++ -static-libgcc
endif
endif
endif
endif
endif
LDFLAGS_STATIC = -DNDEBUG $(LDFLAGS_STATIC_2)
ifdef DEF_FILE
ifdef IS_MINGW
SHARED_EXT=.dll
LDFLAGS = -shared -DEF $(DEF_FILE) $(LDFLAGS_STATIC)
else
SHARED_EXT=.so
LDFLAGS = -shared -fPIC $(LDFLAGS_STATIC)
CC_SHARED=-fPIC
endif
else
LDFLAGS = $(LDFLAGS_STATIC)
# -s is not required for clang, do we need it for GGC ???
# -s
#-static -static-libgcc -static-libstdc++
ifdef IS_MINGW
SHARED_EXT=.exe
else
SHARED_EXT=
endif
endif
PROGPATH = $(O)/$(PROG)$(SHARED_EXT)
PROGPATH_STATIC = $(O)/$(PROG)s$(SHARED_EXT)
ifndef O
O=_o
endif
ifdef IS_MINGW
ifdef MSYSTEM
RM = rm -f
MY_MKDIR=mkdir -p
DEL_OBJ_EXE = -$(RM) $(PROGPATH) $(PROGPATH_STATIC) $(OBJS)
else
RM = del
MY_MKDIR=mkdir
DEL_OBJ_EXE = -$(RM) $(O)\*.o $(O)\$(PROG).exe $(O)\$(PROG).dll
endif
LIB2 = -lOle32 -loleaut32 -luuid -ladvapi32 -lUser32 -lShell32
CFLAGS_EXTRA = -DUNICODE -D_UNICODE
# -Wno-delete-non-virtual-dtor
else
RM = rm -f
MY_MKDIR=mkdir -p
# CFLAGS_BASE := $(CFLAGS_BASE) -DZ7_ST
# CFLAGS_EXTRA = -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE
# LOCAL_LIBS=-lpthread
# LOCAL_LIBS_DLL=$(LOCAL_LIBS) -ldl
LIB2 = -lpthread -ldl
DEL_OBJ_EXE = -$(RM) $(PROGPATH) $(PROGPATH_STATIC) $(OBJS)
endif
ifdef IS_X64
AFLAGS_ABI = -elf64 -DABI_LINUX
else
AFLAGS_ABI = -elf -DABI_LINUX -DABI_CDECL
# -DABI_CDECL
# -DABI_LINUX
# -DABI_CDECL
endif
AFLAGS = $(AFLAGS_ABI) -Fo$(O)/
C_WARN_FLAGS =
CFLAGS = $(LOCAL_FLAGS) $(CFLAGS_BASE2) $(CFLAGS_BASE) $(CFLAGS_EXTRA) $(C_WARN_FLAGS) $(FLAGS_FLTO) $(CC_SHARED) -o $@
STATIC_TARGET=
ifdef COMPL_STATIC
STATIC_TARGET=$(PROGPATH_STATIC)
endif
all: $(O) $(PROGPATH) $(STATIC_TARGET)
$(O):
$(MY_MKDIR) $(O)
ifneq ($(CC), $(CROSS_COMPILE)clang)
LFLAGS_STRIP = -s
endif
LFLAGS_ALL = $(LFLAGS_STRIP) $(MY_ARCH_2) $(LDFLAGS) $(FLAGS_FLTO) $(LD_arch) $(OBJS) $(MY_LIBS) $(LIB2)
$(PROGPATH): $(OBJS)
$(CC) -o $(PROGPATH) $(LFLAGS_ALL)
$(PROGPATH_STATIC): $(OBJS)
$(CC) -static -o $(PROGPATH_STATIC) $(LFLAGS_ALL)
ifndef NO_DEFAULT_RES
# old mingw without -FO
# windres.exe $(RFLAGS) resource.rc $O/resource.o
$O/resource.o: resource.rc
$(RC) $(RFLAGS) resource.rc $(O)/resource.o
endif
# windres.exe $(RFLAGS) resource.rc $(O)\resource.o
# windres.exe $(RFLAGS) resource.rc -FO $(O)/resource.o
# $(RC) $(RFLAGS) resource.rc -FO $(O)/resource.o
$O/7zAlloc.o: ../../../C/7zAlloc.c
$(CC) $(CFLAGS) $<
$O/7zArcIn.o: ../../../C/7zArcIn.c
$(CC) $(CFLAGS) $<
$O/7zBuf.o: ../../../C/7zBuf.c
$(CC) $(CFLAGS) $<
$O/7zBuf2.o: ../../../C/7zBuf2.c
$(CC) $(CFLAGS) $<
$O/7zCrc.o: ../../../C/7zCrc.c
$(CC) $(CFLAGS) $<
$O/7zDec.o: ../../../C/7zDec.c
$(CC) $(CFLAGS) $<
$O/7zFile.o: ../../../C/7zFile.c
$(CC) $(CFLAGS) $<
$O/7zStream.o: ../../../C/7zStream.c
$(CC) $(CFLAGS) $<
$O/Aes.o: ../../../C/Aes.c
$(CC) $(CFLAGS) $<
$O/Alloc.o: ../../../C/Alloc.c
$(CC) $(CFLAGS) $<
$O/Bcj2.o: ../../../C/Bcj2.c
$(CC) $(CFLAGS) $<
$O/Bcj2Enc.o: ../../../C/Bcj2Enc.c
$(CC) $(CFLAGS) $<
$O/Blake2s.o: ../../../C/Blake2s.c
$(CC) $(CFLAGS) $<
$O/Bra.o: ../../../C/Bra.c
$(CC) $(CFLAGS) $<
$O/Bra86.o: ../../../C/Bra86.c
$(CC) $(CFLAGS) $<
$O/BraIA64.o: ../../../C/BraIA64.c
$(CC) $(CFLAGS) $<
$O/BwtSort.o: ../../../C/BwtSort.c
$(CC) $(CFLAGS) $<
$O/CpuArch.o: ../../../C/CpuArch.c
$(CC) $(CFLAGS) $<
$O/Delta.o: ../../../C/Delta.c
$(CC) $(CFLAGS) $<
$O/DllSecur.o: ../../../C/DllSecur.c
$(CC) $(CFLAGS) $<
$O/HuffEnc.o: ../../../C/HuffEnc.c
$(CC) $(CFLAGS) $<
$O/LzFind.o: ../../../C/LzFind.c
$(CC) $(CFLAGS) $<
# ifdef MT_FILES
$O/LzFindMt.o: ../../../C/LzFindMt.c
$(CC) $(CFLAGS) $<
$O/LzFindOpt.o: ../../../C/LzFindOpt.c
$(CC) $(CFLAGS) $<
$O/Threads.o: ../../../C/Threads.c
$(CC) $(CFLAGS) $<
# endif
$O/LzmaEnc.o: ../../../C/LzmaEnc.c
$(CC) $(CFLAGS) $<
$O/Lzma86Dec.o: ../../../C/Lzma86Dec.c
$(CC) $(CFLAGS) $<
$O/Lzma86Enc.o: ../../../C/Lzma86Enc.c
$(CC) $(CFLAGS) $<
$O/Lzma2Dec.o: ../../../C/Lzma2Dec.c
$(CC) $(CFLAGS) $<
$O/Lzma2DecMt.o: ../../../C/Lzma2DecMt.c
$(CC) $(CFLAGS) $<
$O/Lzma2Enc.o: ../../../C/Lzma2Enc.c
$(CC) $(CFLAGS) $<
$O/LzmaLib.o: ../../../C/LzmaLib.c
$(CC) $(CFLAGS) $<
$O/MtCoder.o: ../../../C/MtCoder.c
$(CC) $(CFLAGS) $<
$O/MtDec.o: ../../../C/MtDec.c
$(CC) $(CFLAGS) $<
$O/Ppmd7.o: ../../../C/Ppmd7.c
$(CC) $(CFLAGS) $<
$O/Ppmd7aDec.o: ../../../C/Ppmd7aDec.c
$(CC) $(CFLAGS) $<
$O/Ppmd7Dec.o: ../../../C/Ppmd7Dec.c
$(CC) $(CFLAGS) $<
$O/Ppmd7Enc.o: ../../../C/Ppmd7Enc.c
$(CC) $(CFLAGS) $<
$O/Ppmd8.o: ../../../C/Ppmd8.c
$(CC) $(CFLAGS) $<
$O/Ppmd8Dec.o: ../../../C/Ppmd8Dec.c
$(CC) $(CFLAGS) $<
$O/Ppmd8Enc.o: ../../../C/Ppmd8Enc.c
$(CC) $(CFLAGS) $<
$O/Sha1.o: ../../../C/Sha1.c
$(CC) $(CFLAGS) $<
$O/Sha256.o: ../../../C/Sha256.c
$(CC) $(CFLAGS) $<
$O/Sort.o: ../../../C/Sort.c
$(CC) $(CFLAGS) $<
$O/SwapBytes.o: ../../../C/SwapBytes.c
$(CC) $(CFLAGS) $<
$O/Xz.o: ../../../C/Xz.c
$(CC) $(CFLAGS) $<
$O/XzCrc64.o: ../../../C/XzCrc64.c
$(CC) $(CFLAGS) $<
$O/XzDec.o: ../../../C/XzDec.c
$(CC) $(CFLAGS) $<
$O/XzEnc.o: ../../../C/XzEnc.c
$(CC) $(CFLAGS) $<
$O/XzIn.o: ../../../C/XzIn.c
$(CC) $(CFLAGS) $<
ifdef USE_ASM
ifdef IS_X64
USE_X86_ASM=1
else
ifdef IS_X86
USE_X86_ASM=1
endif
endif
endif
ifdef USE_X86_ASM
$O/7zCrcOpt.o: ../../../Asm/x86/7zCrcOpt.asm
$(MY_ASM) $(AFLAGS) $<
$O/XzCrc64Opt.o: ../../../Asm/x86/XzCrc64Opt.asm
$(MY_ASM) $(AFLAGS) $<
$O/AesOpt.o: ../../../Asm/x86/AesOpt.asm
$(MY_ASM) $(AFLAGS) $<
$O/Sha1Opt.o: ../../../Asm/x86/Sha1Opt.asm
$(MY_ASM) $(AFLAGS) $<
$O/Sha256Opt.o: ../../../Asm/x86/Sha256Opt.asm
$(MY_ASM) $(AFLAGS) $<
else
$O/7zCrcOpt.o: ../../7zCrcOpt.c
$(CC) $(CFLAGS) $<
$O/XzCrc64Opt.o: ../../XzCrc64Opt.c
$(CC) $(CFLAGS) $<
$O/Sha1Opt.o: ../../Sha1Opt.c
$(CC) $(CFLAGS) $<
$O/Sha256Opt.o: ../../Sha256Opt.c
$(CC) $(CFLAGS) $<
$O/AesOpt.o: ../../AesOpt.c
$(CC) $(CFLAGS) $<
endif
ifdef USE_LZMA_DEC_ASM
ifdef IS_X64
$O/LzmaDecOpt.o: ../../../Asm/x86/LzmaDecOpt.asm
$(MY_ASM) $(AFLAGS) $<
endif
ifdef IS_ARM64
$O/LzmaDecOpt.o: ../../../Asm/arm64/LzmaDecOpt.S ../../../Asm/arm64/7zAsm.S
$(CC) $(CFLAGS) $<
endif
$O/LzmaDec.o: ../../LzmaDec.c
$(CC) $(CFLAGS) -DZ7_LZMA_DEC_OPT $<
else
$O/LzmaDec.o: ../../LzmaDec.c
$(CC) $(CFLAGS) $<
endif
$O/7zMain.o: ../../../C/Util/7z/7zMain.c
$(CC) $(CFLAGS) $<
$O/7zipInstall.o: ../../../C/Util/7zipInstall/7zipInstall.c
$(CC) $(CFLAGS) $<
$O/7zipUninstall.o: ../../../C/Util/7zipUninstall/7zipUninstall.c
$(CC) $(CFLAGS) $<
$O/LzmaUtil.o: ../../../C/Util/Lzma/LzmaUtil.c
$(CC) $(CFLAGS) $<
$O/XzUtil.o: ../../../C/Util/Xz/XzUtil.c
$(CC) $(CFLAGS) $<
clean:
-$(DEL_OBJ_EXE)

375
C/Aes.c
View File

@@ -1,13 +1,20 @@
/* Aes.c -- AES encryption / decryption
2008-08-05
Igor Pavlov
Public domain */
2023-04-02 : Igor Pavlov : Public domain */
#include "Precomp.h"
#include "Aes.h"
#include "CpuArch.h"
#include "Aes.h"
AES_CODE_FUNC g_AesCbc_Decode;
#ifndef Z7_SFX
AES_CODE_FUNC g_AesCbc_Encode;
AES_CODE_FUNC g_AesCtr_Code;
UInt32 g_Aes_SupportedFunctions_Flags;
#endif
static UInt32 T[256 * 4];
static Byte Sbox[256] = {
static const Byte Sbox[256] = {
0x63, 0x7c, 0x77, 0x7b, 0xf2, 0x6b, 0x6f, 0xc5, 0x30, 0x01, 0x67, 0x2b, 0xfe, 0xd7, 0xab, 0x76,
0xca, 0x82, 0xc9, 0x7d, 0xfa, 0x59, 0x47, 0xf0, 0xad, 0xd4, 0xa2, 0xaf, 0x9c, 0xa4, 0x72, 0xc0,
0xb7, 0xfd, 0x93, 0x26, 0x36, 0x3f, 0xf7, 0xcc, 0x34, 0xa5, 0xe5, 0xf1, 0x71, 0xd8, 0x31, 0x15,
@@ -25,11 +32,10 @@ static Byte Sbox[256] = {
0xe1, 0xf8, 0x98, 0x11, 0x69, 0xd9, 0x8e, 0x94, 0x9b, 0x1e, 0x87, 0xe9, 0xce, 0x55, 0x28, 0xdf,
0x8c, 0xa1, 0x89, 0x0d, 0xbf, 0xe6, 0x42, 0x68, 0x41, 0x99, 0x2d, 0x0f, 0xb0, 0x54, 0xbb, 0x16};
static UInt32 D[256 * 4];
static Byte InvS[256];
static Byte Rcon[11] = { 0x00, 0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80, 0x1b, 0x36 };
#define xtime(x) ((((x) << 1) ^ (((x) & 0x80) != 0 ? 0x1B : 0)) & 0xFF)
#define Ui32(a0, a1, a2, a3) ((UInt32)(a0) | ((UInt32)(a1) << 8) | ((UInt32)(a2) << 16) | ((UInt32)(a3) << 24))
@@ -37,120 +43,221 @@ static Byte Rcon[11] = { 0x00, 0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80, 0
#define gb0(x) ( (x) & 0xFF)
#define gb1(x) (((x) >> ( 8)) & 0xFF)
#define gb2(x) (((x) >> (16)) & 0xFF)
#define gb3(x) (((x) >> (24)) & 0xFF)
#define gb3(x) (((x) >> (24)))
#define gb(n, x) gb ## n(x)
#define TT(x) (T + (x << 8))
#define DD(x) (D + (x << 8))
// #define Z7_SHOW_AES_STATUS
#ifdef MY_CPU_X86_OR_AMD64
#define USE_HW_AES
#elif defined(MY_CPU_ARM_OR_ARM64) && defined(MY_CPU_LE)
#if defined(__clang__)
#if (__clang_major__ >= 8) // fix that check
#define USE_HW_AES
#endif
#elif defined(__GNUC__)
#if (__GNUC__ >= 6) // fix that check
#define USE_HW_AES
#endif
#elif defined(_MSC_VER)
#if _MSC_VER >= 1910
#define USE_HW_AES
#endif
#endif
#endif
#ifdef USE_HW_AES
#ifdef Z7_SHOW_AES_STATUS
#include <stdio.h>
#define PRF(x) x
#else
#define PRF(x)
#endif
#endif
void AesGenTables(void)
{
unsigned i;
for (i = 0; i < 256; i++)
InvS[Sbox[i]] = (Byte)i;
for (i = 0; i < 256; i++)
{
{
UInt32 a1 = Sbox[i];
UInt32 a2 = xtime(a1);
UInt32 a3 = xtime(a1) ^ a1;
T[ i] = Ui32(a2, a1, a1, a3);
T[0x100 + i] = Ui32(a3, a2, a1, a1);
T[0x200 + i] = Ui32(a1, a3, a2, a1);
T[0x300 + i] = Ui32(a1, a1, a3, a2);
const UInt32 a1 = Sbox[i];
const UInt32 a2 = xtime(a1);
const UInt32 a3 = a2 ^ a1;
TT(0)[i] = Ui32(a2, a1, a1, a3);
TT(1)[i] = Ui32(a3, a2, a1, a1);
TT(2)[i] = Ui32(a1, a3, a2, a1);
TT(3)[i] = Ui32(a1, a1, a3, a2);
}
{
UInt32 a1 = InvS[i];
UInt32 a2 = xtime(a1);
UInt32 a4 = xtime(a2);
UInt32 a8 = xtime(a4);
UInt32 a9 = a8 ^ a1;
UInt32 aB = a8 ^ a2 ^ a1;
UInt32 aD = a8 ^ a4 ^ a1;
UInt32 aE = a8 ^ a4 ^ a2;
D[ i] = Ui32(aE, a9, aD, aB);
D[0x100 + i] = Ui32(aB, aE, a9, aD);
D[0x200 + i] = Ui32(aD, aB, aE, a9);
D[0x300 + i] = Ui32(a9, aD, aB, aE);
const UInt32 a1 = InvS[i];
const UInt32 a2 = xtime(a1);
const UInt32 a4 = xtime(a2);
const UInt32 a8 = xtime(a4);
const UInt32 a9 = a8 ^ a1;
const UInt32 aB = a8 ^ a2 ^ a1;
const UInt32 aD = a8 ^ a4 ^ a1;
const UInt32 aE = a8 ^ a4 ^ a2;
DD(0)[i] = Ui32(aE, a9, aD, aB);
DD(1)[i] = Ui32(aB, aE, a9, aD);
DD(2)[i] = Ui32(aD, aB, aE, a9);
DD(3)[i] = Ui32(a9, aD, aB, aE);
}
}
{
AES_CODE_FUNC d = AesCbc_Decode;
#ifndef Z7_SFX
AES_CODE_FUNC e = AesCbc_Encode;
AES_CODE_FUNC c = AesCtr_Code;
UInt32 flags = 0;
#endif
#ifdef USE_HW_AES
if (CPU_IsSupported_AES())
{
// #pragma message ("AES HW")
PRF(printf("\n===AES HW\n"));
d = AesCbc_Decode_HW;
#ifndef Z7_SFX
e = AesCbc_Encode_HW;
c = AesCtr_Code_HW;
flags = k_Aes_SupportedFunctions_HW;
#endif
#ifdef MY_CPU_X86_OR_AMD64
if (CPU_IsSupported_VAES_AVX2())
{
PRF(printf("\n===vaes avx2\n"));
d = AesCbc_Decode_HW_256;
#ifndef Z7_SFX
c = AesCtr_Code_HW_256;
flags |= k_Aes_SupportedFunctions_HW_256;
#endif
}
#endif
}
#endif
g_AesCbc_Decode = d;
#ifndef Z7_SFX
g_AesCbc_Encode = e;
g_AesCtr_Code = c;
g_Aes_SupportedFunctions_Flags = flags;
#endif
}
}
#define HT(i, x, s) (T + (x << 8))[gb ## x(s[(i + x) & 3])]
#define HT(i, x, s) TT(x)[gb(x, s[(i + x) & 3])]
#define HT4(m, i, s, p) m[i] = \
HT(i, 0, s) ^ \
HT(i, 1, s) ^ \
HT(i, 2, s) ^ \
HT(i, 3, s) ^ w[p + i]
/* such order (2031) in HT16 is for VC6/K8 speed optimization) */
#define HT16(m, s, p) \
HT4(m, 2, s, p); \
HT4(m, 0, s, p); \
HT4(m, 3, s, p); \
HT4(m, 1, s, p); \
#define FT(i, x) Sbox[gb ## x(m[(i + x) & 3])]
#define HT16(m, s, p) \
HT4(m, 0, s, p); \
HT4(m, 1, s, p); \
HT4(m, 2, s, p); \
HT4(m, 3, s, p); \
#define FT(i, x) Sbox[gb(x, m[(i + x) & 3])]
#define FT4(i) dest[i] = Ui32(FT(i, 0), FT(i, 1), FT(i, 2), FT(i, 3)) ^ w[i];
#define HD(i, x, s) (D + (x << 8))[gb ## x(s[(i - x) & 3])]
#define HD(i, x, s) DD(x)[gb(x, s[(i - x) & 3])]
#define HD4(m, i, s, p) m[i] = \
HD(i, 0, s) ^ \
HD(i, 1, s) ^ \
HD(i, 2, s) ^ \
HD(i, 3, s) ^ w[p + i];
/* such order (0231) in HD16 is for VC6/K8 speed optimization) */
#define HD16(m, s, p) \
HD4(m, 0, s, p); \
HD4(m, 1, s, p); \
HD4(m, 2, s, p); \
HD4(m, 3, s, p); \
HD4(m, 1, s, p); \
#define FD(i, x) InvS[gb ## x(m[(i - x) & 3])]
#define FD(i, x) InvS[gb(x, m[(i - x) & 3])]
#define FD4(i) dest[i] = Ui32(FD(i, 0), FD(i, 1), FD(i, 2), FD(i, 3)) ^ w[i];
void Aes_SetKeyEncode(CAes *p, const Byte *key, unsigned keySize)
void Z7_FASTCALL Aes_SetKey_Enc(UInt32 *w, const Byte *key, unsigned keySize)
{
unsigned i, wSize;
UInt32 *w;
unsigned i, m;
const UInt32 *wLim;
UInt32 t;
UInt32 rcon = 1;
keySize /= 4;
p->numRounds2 = keySize / 2 + 3;
wSize = (p->numRounds2 * 2 + 1) * 4;
w = p->rkey;
w[0] = ((UInt32)keySize / 2) + 3;
w += 4;
for (i = 0; i < keySize; i++, key += 4)
w[i] = Ui32(key[0], key[1], key[2], key[3]);
w[i] = GetUi32(key);
for (; i < wSize; i++)
t = w[(size_t)keySize - 1];
wLim = w + (size_t)keySize * 3 + 28;
m = 0;
do
{
UInt32 t = w[i - 1];
unsigned rem = i % keySize;
if (rem == 0)
t = Ui32(Sbox[gb1(t)] ^ Rcon[i / keySize], Sbox[gb2(t)], Sbox[gb3(t)], Sbox[gb0(t)]);
else if (keySize > 6 && rem == 4)
if (m == 0)
{
t = Ui32(Sbox[gb1(t)] ^ rcon, Sbox[gb2(t)], Sbox[gb3(t)], Sbox[gb0(t)]);
rcon <<= 1;
if (rcon & 0x100)
rcon = 0x1b;
m = keySize;
}
else if (m == 4 && keySize > 6)
t = Ui32(Sbox[gb0(t)], Sbox[gb1(t)], Sbox[gb2(t)], Sbox[gb3(t)]);
w[i] = w[i - keySize] ^ t;
m--;
t ^= w[0];
w[keySize] = t;
}
while (++w != wLim);
}
void Aes_SetKeyDecode(CAes *p, const Byte *key, unsigned keySize)
void Z7_FASTCALL Aes_SetKey_Dec(UInt32 *w, const Byte *key, unsigned keySize)
{
unsigned i, num;
UInt32 *w;
Aes_SetKeyEncode(p, key, keySize);
num = p->numRounds2 * 8 - 4;
w = p->rkey + 4;
Aes_SetKey_Enc(w, key, keySize);
num = keySize + 20;
w += 8;
for (i = 0; i < num; i++)
{
UInt32 r = w[i];
w[i] =
D[ Sbox[gb0(r)]] ^
D[0x100 + Sbox[gb1(r)]] ^
D[0x200 + Sbox[gb2(r)]] ^
D[0x300 + Sbox[gb3(r)]];
DD(0)[Sbox[gb0(r)]] ^
DD(1)[Sbox[gb1(r)]] ^
DD(2)[Sbox[gb2(r)]] ^
DD(3)[Sbox[gb3(r)]];
}
}
static void AesEncode32(UInt32 *dest, const UInt32 *src, const UInt32 *w, unsigned numRounds2)
/* Aes_Encode and Aes_Decode functions work with little-endian words.
src and dest are pointers to 4 UInt32 words.
src and dest can point to same block */
// Z7_FORCE_INLINE
static void Aes_Encode(const UInt32 *w, UInt32 *dest, const UInt32 *src)
{
UInt32 s[4];
UInt32 m[4];
UInt32 numRounds2 = w[0];
w += 4;
s[0] = src[0] ^ w[0];
s[1] = src[1] ^ w[1];
s[2] = src[2] ^ w[2];
@@ -158,21 +265,26 @@ static void AesEncode32(UInt32 *dest, const UInt32 *src, const UInt32 *w, unsign
w += 4;
for (;;)
{
HT16(m, s, 0);
HT16(m, s, 0)
if (--numRounds2 == 0)
break;
HT16(s, m, 4);
HT16(s, m, 4)
w += 8;
}
w += 4;
FT4(0); FT4(1); FT4(2); FT4(3);
FT4(0)
FT4(1)
FT4(2)
FT4(3)
}
static void AesDecode32(UInt32 *dest, const UInt32 *src, const UInt32 *w, unsigned numRounds2)
Z7_FORCE_INLINE
static void Aes_Decode(const UInt32 *w, UInt32 *dest, const UInt32 *src)
{
UInt32 s[4];
UInt32 m[4];
w += numRounds2 * 8;
UInt32 numRounds2 = w[0];
w += 4 + numRounds2 * 8;
s[0] = src[0] ^ w[0];
s[1] = src[1] ^ w[1];
s[2] = src[2] ^ w[2];
@@ -180,83 +292,102 @@ static void AesDecode32(UInt32 *dest, const UInt32 *src, const UInt32 *w, unsign
for (;;)
{
w -= 8;
HD16(m, s, 4);
HD16(m, s, 4)
if (--numRounds2 == 0)
break;
HD16(s, m, 0);
HD16(s, m, 0)
}
FD4(0); FD4(1); FD4(2); FD4(3);
FD4(0)
FD4(1)
FD4(2)
FD4(3)
}
void Aes_Encode32(const CAes *p, UInt32 *dest, const UInt32 *src)
{
AesEncode32(dest, src, p->rkey, p->numRounds2);
}
void Aes_Decode32(const CAes *p, UInt32 *dest, const UInt32 *src)
{
AesDecode32(dest, src, p->rkey, p->numRounds2);
}
void AesCbc_Init(CAesCbc *p, const Byte *iv)
void AesCbc_Init(UInt32 *p, const Byte *iv)
{
unsigned i;
for (i = 0; i < 4; i++)
p->prev[i] = GetUi32(iv + i * 4);
p[i] = GetUi32(iv + i * 4);
}
SizeT AesCbc_Encode(CAesCbc *p, Byte *data, SizeT size)
void Z7_FASTCALL AesCbc_Encode(UInt32 *p, Byte *data, size_t numBlocks)
{
SizeT i;
if (size == 0)
return 0;
if (size < AES_BLOCK_SIZE)
return AES_BLOCK_SIZE;
size -= AES_BLOCK_SIZE;
for (i = 0; i <= size; i += AES_BLOCK_SIZE, data += AES_BLOCK_SIZE)
for (; numBlocks != 0; numBlocks--, data += AES_BLOCK_SIZE)
{
p->prev[0] ^= GetUi32(data);
p->prev[1] ^= GetUi32(data + 4);
p->prev[2] ^= GetUi32(data + 8);
p->prev[3] ^= GetUi32(data + 12);
p[0] ^= GetUi32(data);
p[1] ^= GetUi32(data + 4);
p[2] ^= GetUi32(data + 8);
p[3] ^= GetUi32(data + 12);
AesEncode32(p->prev, p->prev, p->aes.rkey, p->aes.numRounds2);
Aes_Encode(p + 4, p, p);
SetUi32(data, p->prev[0]);
SetUi32(data + 4, p->prev[1]);
SetUi32(data + 8, p->prev[2]);
SetUi32(data + 12, p->prev[3]);
SetUi32(data, p[0])
SetUi32(data + 4, p[1])
SetUi32(data + 8, p[2])
SetUi32(data + 12, p[3])
}
return i;
}
SizeT AesCbc_Decode(CAesCbc *p, Byte *data, SizeT size)
void Z7_FASTCALL AesCbc_Decode(UInt32 *p, Byte *data, size_t numBlocks)
{
SizeT i;
UInt32 in[4], out[4];
if (size == 0)
return 0;
if (size < AES_BLOCK_SIZE)
return AES_BLOCK_SIZE;
size -= AES_BLOCK_SIZE;
for (i = 0; i <= size; i += AES_BLOCK_SIZE, data += AES_BLOCK_SIZE)
for (; numBlocks != 0; numBlocks--, data += AES_BLOCK_SIZE)
{
in[0] = GetUi32(data);
in[1] = GetUi32(data + 4);
in[2] = GetUi32(data + 8);
in[3] = GetUi32(data + 12);
Aes_Decode(p + 4, out, in);
SetUi32(data, p[0] ^ out[0])
SetUi32(data + 4, p[1] ^ out[1])
SetUi32(data + 8, p[2] ^ out[2])
SetUi32(data + 12, p[3] ^ out[3])
AesDecode32(out, in, p->aes.rkey, p->aes.numRounds2);
SetUi32(data, p->prev[0] ^ out[0]);
SetUi32(data + 4, p->prev[1] ^ out[1]);
SetUi32(data + 8, p->prev[2] ^ out[2]);
SetUi32(data + 12, p->prev[3] ^ out[3]);
p->prev[0] = in[0];
p->prev[1] = in[1];
p->prev[2] = in[2];
p->prev[3] = in[3];
p[0] = in[0];
p[1] = in[1];
p[2] = in[2];
p[3] = in[3];
}
return i;
}
void Z7_FASTCALL AesCtr_Code(UInt32 *p, Byte *data, size_t numBlocks)
{
for (; numBlocks != 0; numBlocks--)
{
UInt32 temp[4];
unsigned i;
if (++p[0] == 0)
p[1]++;
Aes_Encode(p + 4, temp, p);
for (i = 0; i < 4; i++, data += 4)
{
const UInt32 t = temp[i];
#ifdef MY_CPU_LE_UNALIGN
*((UInt32 *)(void *)data) ^= t;
#else
data[0] = (Byte)(data[0] ^ (t & 0xFF));
data[1] = (Byte)(data[1] ^ ((t >> 8) & 0xFF));
data[2] = (Byte)(data[2] ^ ((t >> 16) & 0xFF));
data[3] = (Byte)(data[3] ^ ((t >> 24)));
#endif
}
}
}
#undef xtime
#undef Ui32
#undef gb0
#undef gb1
#undef gb2
#undef gb3
#undef gb
#undef TT
#undef DD
#undef USE_HW_AES
#undef PRF

74
C/Aes.h
View File

@@ -1,48 +1,60 @@
/* Aes.h -- AES encryption / decryption
2008-08-05
Igor Pavlov
Public domain */
2023-04-02 : Igor Pavlov : Public domain */
#ifndef __AES_H
#define __AES_H
#ifndef ZIP7_INC_AES_H
#define ZIP7_INC_AES_H
#include "Types.h"
#include "7zTypes.h"
EXTERN_C_BEGIN
#define AES_BLOCK_SIZE 16
typedef struct
{
unsigned numRounds2; /* = numRounds / 2 */
UInt32 rkey[(14 + 1) * 4];
} CAes;
/* Call AesGenTables one time before other AES functions */
void AesGenTables(void);
/* UInt32 pointers must be 16-byte aligned */
/* 16-byte (4 * 32-bit words) blocks: 1 (IV) + 1 (keyMode) + 15 (AES-256 roundKeys) */
#define AES_NUM_IVMRK_WORDS ((1 + 1 + 15) * 4)
/* aes - 16-byte aligned pointer to keyMode+roundKeys sequence */
/* keySize = 16 or 24 or 32 (bytes) */
void Aes_SetKeyEncode(CAes *p, const Byte *key, unsigned keySize);
void Aes_SetKeyDecode(CAes *p, const Byte *key, unsigned keySize);
typedef void (Z7_FASTCALL *AES_SET_KEY_FUNC)(UInt32 *aes, const Byte *key, unsigned keySize);
void Z7_FASTCALL Aes_SetKey_Enc(UInt32 *aes, const Byte *key, unsigned keySize);
void Z7_FASTCALL Aes_SetKey_Dec(UInt32 *aes, const Byte *key, unsigned keySize);
/* Aes_Encode32 and Aes_Decode32 functions work with little-endian words.
src and dest are pointers to 4 UInt32 words.
arc and dest can point to same block */
void Aes_Encode32(const CAes *p, UInt32 *dest, const UInt32 *src);
void Aes_Decode32(const CAes *p, UInt32 *dest, const UInt32 *src);
/* ivAes - 16-byte aligned pointer to iv+keyMode+roundKeys sequence: UInt32[AES_NUM_IVMRK_WORDS] */
void AesCbc_Init(UInt32 *ivAes, const Byte *iv); /* iv size is AES_BLOCK_SIZE */
typedef struct
{
UInt32 prev[4];
CAes aes;
} CAesCbc;
/* data - 16-byte aligned pointer to data */
/* numBlocks - the number of 16-byte blocks in data array */
typedef void (Z7_FASTCALL *AES_CODE_FUNC)(UInt32 *ivAes, Byte *data, size_t numBlocks);
void AesCbc_Init(CAesCbc *p, const Byte *iv); /* iv size is AES_BLOCK_SIZE */
extern AES_CODE_FUNC g_AesCbc_Decode;
#ifndef Z7_SFX
extern AES_CODE_FUNC g_AesCbc_Encode;
extern AES_CODE_FUNC g_AesCtr_Code;
#define k_Aes_SupportedFunctions_HW (1 << 2)
#define k_Aes_SupportedFunctions_HW_256 (1 << 3)
extern UInt32 g_Aes_SupportedFunctions_Flags;
#endif
/* AesCbc_Encode and AesCbc_Decode:
if (res <= size): Filter have converted res bytes
if (res > size): Filter have not converted anything. And it needs at
least res = AES_BLOCK_SIZE bytes to convert one block */
SizeT AesCbc_Encode(CAesCbc *p, Byte *data, SizeT size);
SizeT AesCbc_Decode(CAesCbc *p, Byte *data, SizeT size);
#define Z7_DECLARE_AES_CODE_FUNC(funcName) \
void Z7_FASTCALL funcName(UInt32 *ivAes, Byte *data, size_t numBlocks);
Z7_DECLARE_AES_CODE_FUNC (AesCbc_Encode)
Z7_DECLARE_AES_CODE_FUNC (AesCbc_Decode)
Z7_DECLARE_AES_CODE_FUNC (AesCtr_Code)
Z7_DECLARE_AES_CODE_FUNC (AesCbc_Encode_HW)
Z7_DECLARE_AES_CODE_FUNC (AesCbc_Decode_HW)
Z7_DECLARE_AES_CODE_FUNC (AesCtr_Code_HW)
Z7_DECLARE_AES_CODE_FUNC (AesCbc_Decode_HW_256)
Z7_DECLARE_AES_CODE_FUNC (AesCtr_Code_HW_256)
EXTERN_C_END
#endif

840
C/AesOpt.c Executable file
View File

@@ -0,0 +1,840 @@
/* AesOpt.c -- AES optimized code for x86 AES hardware instructions
2023-04-02 : Igor Pavlov : Public domain */
#include "Precomp.h"
#include "Aes.h"
#include "CpuArch.h"
#ifdef MY_CPU_X86_OR_AMD64
#if defined(__INTEL_COMPILER)
#if (__INTEL_COMPILER >= 1110)
#define USE_INTEL_AES
#if (__INTEL_COMPILER >= 1900)
#define USE_INTEL_VAES
#endif
#endif
#elif defined(__clang__) && (__clang_major__ > 3 || __clang_major__ == 3 && __clang_minor__ >= 8) \
|| defined(__GNUC__) && (__GNUC__ > 4 || __GNUC__ == 4 && __GNUC_MINOR__ >= 4)
#define USE_INTEL_AES
#if !defined(__AES__)
#define ATTRIB_AES __attribute__((__target__("aes")))
#endif
#if defined(__clang__) && (__clang_major__ >= 8) \
|| defined(__GNUC__) && (__GNUC__ >= 8)
#define USE_INTEL_VAES
#if !defined(__AES__) || !defined(__VAES__) || !defined(__AVX__) || !defined(__AVX2__)
#define ATTRIB_VAES __attribute__((__target__("aes,vaes,avx,avx2")))
#endif
#endif
#elif defined(_MSC_VER)
#if (_MSC_VER > 1500) || (_MSC_FULL_VER >= 150030729)
#define USE_INTEL_AES
#if (_MSC_VER >= 1910)
#define USE_INTEL_VAES
#endif
#endif
#endif
#ifndef ATTRIB_AES
#define ATTRIB_AES
#endif
#ifndef ATTRIB_VAES
#define ATTRIB_VAES
#endif
#ifdef USE_INTEL_AES
#include <wmmintrin.h>
#ifndef USE_INTEL_VAES
#define AES_TYPE_keys UInt32
#define AES_TYPE_data Byte
// #define AES_TYPE_keys __m128i
// #define AES_TYPE_data __m128i
#endif
#define AES_FUNC_START(name) \
void Z7_FASTCALL name(UInt32 *ivAes, Byte *data8, size_t numBlocks)
// void Z7_FASTCALL name(__m128i *p, __m128i *data, size_t numBlocks)
#define AES_FUNC_START2(name) \
AES_FUNC_START (name); \
ATTRIB_AES \
AES_FUNC_START (name)
#define MM_OP(op, dest, src) dest = op(dest, src);
#define MM_OP_m(op, src) MM_OP(op, m, src)
#define MM_XOR( dest, src) MM_OP(_mm_xor_si128, dest, src)
#define AVX_XOR(dest, src) MM_OP(_mm256_xor_si256, dest, src)
AES_FUNC_START2 (AesCbc_Encode_HW)
{
__m128i *p = (__m128i *)(void *)ivAes;
__m128i *data = (__m128i *)(void *)data8;
__m128i m = *p;
const __m128i k0 = p[2];
const __m128i k1 = p[3];
const UInt32 numRounds2 = *(const UInt32 *)(p + 1) - 1;
for (; numBlocks != 0; numBlocks--, data++)
{
UInt32 r = numRounds2;
const __m128i *w = p + 4;
__m128i temp = *data;
MM_XOR (temp, k0)
MM_XOR (m, temp)
MM_OP_m (_mm_aesenc_si128, k1)
do
{
MM_OP_m (_mm_aesenc_si128, w[0])
MM_OP_m (_mm_aesenc_si128, w[1])
w += 2;
}
while (--r);
MM_OP_m (_mm_aesenclast_si128, w[0])
*data = m;
}
*p = m;
}
#define WOP_1(op)
#define WOP_2(op) WOP_1 (op) op (m1, 1)
#define WOP_3(op) WOP_2 (op) op (m2, 2)
#define WOP_4(op) WOP_3 (op) op (m3, 3)
#ifdef MY_CPU_AMD64
#define WOP_5(op) WOP_4 (op) op (m4, 4)
#define WOP_6(op) WOP_5 (op) op (m5, 5)
#define WOP_7(op) WOP_6 (op) op (m6, 6)
#define WOP_8(op) WOP_7 (op) op (m7, 7)
#endif
/*
#define WOP_9(op) WOP_8 (op) op (m8, 8);
#define WOP_10(op) WOP_9 (op) op (m9, 9);
#define WOP_11(op) WOP_10(op) op (m10, 10);
#define WOP_12(op) WOP_11(op) op (m11, 11);
#define WOP_13(op) WOP_12(op) op (m12, 12);
#define WOP_14(op) WOP_13(op) op (m13, 13);
*/
#ifdef MY_CPU_AMD64
#define NUM_WAYS 8
#define WOP_M1 WOP_8
#else
#define NUM_WAYS 4
#define WOP_M1 WOP_4
#endif
#define WOP(op) op (m0, 0) WOP_M1(op)
#define DECLARE_VAR(reg, ii) __m128i reg;
#define LOAD_data( reg, ii) reg = data[ii];
#define STORE_data( reg, ii) data[ii] = reg;
#if (NUM_WAYS > 1)
#define XOR_data_M1(reg, ii) MM_XOR (reg, data[ii- 1])
#endif
#define AVX_DECLARE_VAR(reg, ii) __m256i reg;
#define AVX_LOAD_data( reg, ii) reg = ((const __m256i *)(const void *)data)[ii];
#define AVX_STORE_data( reg, ii) ((__m256i *)(void *)data)[ii] = reg;
#define AVX_XOR_data_M1(reg, ii) AVX_XOR (reg, (((const __m256i *)(const void *)(data - 1))[ii]))
#define MM_OP_key(op, reg) MM_OP(op, reg, key);
#define AES_DEC( reg, ii) MM_OP_key (_mm_aesdec_si128, reg)
#define AES_DEC_LAST( reg, ii) MM_OP_key (_mm_aesdeclast_si128, reg)
#define AES_ENC( reg, ii) MM_OP_key (_mm_aesenc_si128, reg)
#define AES_ENC_LAST( reg, ii) MM_OP_key (_mm_aesenclast_si128, reg)
#define AES_XOR( reg, ii) MM_OP_key (_mm_xor_si128, reg)
#define AVX_AES_DEC( reg, ii) MM_OP_key (_mm256_aesdec_epi128, reg)
#define AVX_AES_DEC_LAST( reg, ii) MM_OP_key (_mm256_aesdeclast_epi128, reg)
#define AVX_AES_ENC( reg, ii) MM_OP_key (_mm256_aesenc_epi128, reg)
#define AVX_AES_ENC_LAST( reg, ii) MM_OP_key (_mm256_aesenclast_epi128, reg)
#define AVX_AES_XOR( reg, ii) MM_OP_key (_mm256_xor_si256, reg)
#define CTR_START(reg, ii) MM_OP (_mm_add_epi64, ctr, one) reg = ctr;
#define CTR_END( reg, ii) MM_XOR (data[ii], reg)
#define AVX_CTR_START(reg, ii) MM_OP (_mm256_add_epi64, ctr2, two) reg = _mm256_xor_si256(ctr2, key);
#define AVX_CTR_END( reg, ii) AVX_XOR (((__m256i *)(void *)data)[ii], reg)
#define WOP_KEY(op, n) { \
const __m128i key = w[n]; \
WOP(op); }
#define AVX_WOP_KEY(op, n) { \
const __m256i key = w[n]; \
WOP(op); }
#define WIDE_LOOP_START \
dataEnd = data + numBlocks; \
if (numBlocks >= NUM_WAYS) \
{ dataEnd -= NUM_WAYS; do { \
#define WIDE_LOOP_END \
data += NUM_WAYS; \
} while (data <= dataEnd); \
dataEnd += NUM_WAYS; } \
#define SINGLE_LOOP \
for (; data < dataEnd; data++)
#define NUM_AES_KEYS_MAX 15
#define WIDE_LOOP_START_AVX(OP) \
dataEnd = data + numBlocks; \
if (numBlocks >= NUM_WAYS * 2) \
{ __m256i keys[NUM_AES_KEYS_MAX]; \
UInt32 ii; \
OP \
for (ii = 0; ii < numRounds; ii++) \
keys[ii] = _mm256_broadcastsi128_si256(p[ii]); \
dataEnd -= NUM_WAYS * 2; do { \
#define WIDE_LOOP_END_AVX(OP) \
data += NUM_WAYS * 2; \
} while (data <= dataEnd); \
dataEnd += NUM_WAYS * 2; \
OP \
_mm256_zeroupper(); \
} \
/* MSVC for x86: If we don't call _mm256_zeroupper(), and -arch:IA32 is not specified,
MSVC still can insert vzeroupper instruction. */
AES_FUNC_START2 (AesCbc_Decode_HW)
{
__m128i *p = (__m128i *)(void *)ivAes;
__m128i *data = (__m128i *)(void *)data8;
__m128i iv = *p;
const __m128i *wStart = p + *(const UInt32 *)(p + 1) * 2 + 2 - 1;
const __m128i *dataEnd;
p += 2;
WIDE_LOOP_START
{
const __m128i *w = wStart;
WOP (DECLARE_VAR)
WOP (LOAD_data)
WOP_KEY (AES_XOR, 1)
do
{
WOP_KEY (AES_DEC, 0)
w--;
}
while (w != p);
WOP_KEY (AES_DEC_LAST, 0)
MM_XOR (m0, iv)
WOP_M1 (XOR_data_M1)
iv = data[NUM_WAYS - 1];
WOP (STORE_data)
}
WIDE_LOOP_END
SINGLE_LOOP
{
const __m128i *w = wStart - 1;
__m128i m = _mm_xor_si128 (w[2], *data);
do
{
MM_OP_m (_mm_aesdec_si128, w[1])
MM_OP_m (_mm_aesdec_si128, w[0])
w -= 2;
}
while (w != p);
MM_OP_m (_mm_aesdec_si128, w[1])
MM_OP_m (_mm_aesdeclast_si128, w[0])
MM_XOR (m, iv)
iv = *data;
*data = m;
}
p[-2] = iv;
}
AES_FUNC_START2 (AesCtr_Code_HW)
{
__m128i *p = (__m128i *)(void *)ivAes;
__m128i *data = (__m128i *)(void *)data8;
__m128i ctr = *p;
UInt32 numRoundsMinus2 = *(const UInt32 *)(p + 1) * 2 - 1;
const __m128i *dataEnd;
__m128i one = _mm_cvtsi32_si128(1);
p += 2;
WIDE_LOOP_START
{
const __m128i *w = p;
UInt32 r = numRoundsMinus2;
WOP (DECLARE_VAR)
WOP (CTR_START)
WOP_KEY (AES_XOR, 0)
w += 1;
do
{
WOP_KEY (AES_ENC, 0)
w += 1;
}
while (--r);
WOP_KEY (AES_ENC_LAST, 0)
WOP (CTR_END)
}
WIDE_LOOP_END
SINGLE_LOOP
{
UInt32 numRounds2 = *(const UInt32 *)(p - 2 + 1) - 1;
const __m128i *w = p;
__m128i m;
MM_OP (_mm_add_epi64, ctr, one)
m = _mm_xor_si128 (ctr, p[0]);
w += 1;
do
{
MM_OP_m (_mm_aesenc_si128, w[0])
MM_OP_m (_mm_aesenc_si128, w[1])
w += 2;
}
while (--numRounds2);
MM_OP_m (_mm_aesenc_si128, w[0])
MM_OP_m (_mm_aesenclast_si128, w[1])
MM_XOR (*data, m)
}
p[-2] = ctr;
}
#ifdef USE_INTEL_VAES
/*
GCC before 2013-Jun:
<immintrin.h>:
#ifdef __AVX__
#include <avxintrin.h>
#endif
GCC after 2013-Jun:
<immintrin.h>:
#include <avxintrin.h>
CLANG 3.8+:
{
<immintrin.h>:
#if !defined(_MSC_VER) || defined(__AVX__)
#include <avxintrin.h>
#endif
if (the compiler is clang for Windows and if global arch is not set for __AVX__)
[ if (defined(_MSC_VER) && !defined(__AVX__)) ]
{
<immintrin.h> doesn't include <avxintrin.h>
and we have 2 ways to fix it:
1) we can define required __AVX__ before <immintrin.h>
or
2) we can include <avxintrin.h> after <immintrin.h>
}
}
If we include <avxintrin.h> manually for GCC/CLANG, it's
required that <immintrin.h> must be included before <avxintrin.h>.
*/
/*
#if defined(__clang__) && defined(_MSC_VER)
#define __AVX__
#define __AVX2__
#define __VAES__
#endif
*/
#include <immintrin.h>
#if defined(__clang__) && defined(_MSC_VER)
#if !defined(__AVX__)
#include <avxintrin.h>
#endif
#if !defined(__AVX2__)
#include <avx2intrin.h>
#endif
#if !defined(__VAES__)
#include <vaesintrin.h>
#endif
#endif // __clang__ && _MSC_VER
#define VAES_FUNC_START2(name) \
AES_FUNC_START (name); \
ATTRIB_VAES \
AES_FUNC_START (name)
VAES_FUNC_START2 (AesCbc_Decode_HW_256)
{
__m128i *p = (__m128i *)(void *)ivAes;
__m128i *data = (__m128i *)(void *)data8;
__m128i iv = *p;
const __m128i *dataEnd;
UInt32 numRounds = *(const UInt32 *)(p + 1) * 2 + 1;
p += 2;
WIDE_LOOP_START_AVX(;)
{
const __m256i *w = keys + numRounds - 2;
WOP (AVX_DECLARE_VAR)
WOP (AVX_LOAD_data)
AVX_WOP_KEY (AVX_AES_XOR, 1)
do
{
AVX_WOP_KEY (AVX_AES_DEC, 0)
w--;
}
while (w != keys);
AVX_WOP_KEY (AVX_AES_DEC_LAST, 0)
AVX_XOR (m0, _mm256_setr_m128i(iv, data[0]))
WOP_M1 (AVX_XOR_data_M1)
iv = data[NUM_WAYS * 2 - 1];
WOP (AVX_STORE_data)
}
WIDE_LOOP_END_AVX(;)
SINGLE_LOOP
{
const __m128i *w = p + *(const UInt32 *)(p + 1 - 2) * 2 + 1 - 3;
__m128i m = _mm_xor_si128 (w[2], *data);
do
{
MM_OP_m (_mm_aesdec_si128, w[1])
MM_OP_m (_mm_aesdec_si128, w[0])
w -= 2;
}
while (w != p);
MM_OP_m (_mm_aesdec_si128, w[1])
MM_OP_m (_mm_aesdeclast_si128, w[0])
MM_XOR (m, iv)
iv = *data;
*data = m;
}
p[-2] = iv;
}
/*
SSE2: _mm_cvtsi32_si128 : movd
AVX: _mm256_setr_m128i : vinsertf128
AVX2: _mm256_add_epi64 : vpaddq ymm, ymm, ymm
_mm256_extracti128_si256 : vextracti128
_mm256_broadcastsi128_si256 : vbroadcasti128
*/
#define AVX_CTR_LOOP_START \
ctr2 = _mm256_setr_m128i(_mm_sub_epi64(ctr, one), ctr); \
two = _mm256_setr_m128i(one, one); \
two = _mm256_add_epi64(two, two); \
// two = _mm256_setr_epi64x(2, 0, 2, 0);
#define AVX_CTR_LOOP_ENC \
ctr = _mm256_extracti128_si256 (ctr2, 1); \
VAES_FUNC_START2 (AesCtr_Code_HW_256)
{
__m128i *p = (__m128i *)(void *)ivAes;
__m128i *data = (__m128i *)(void *)data8;
__m128i ctr = *p;
UInt32 numRounds = *(const UInt32 *)(p + 1) * 2 + 1;
const __m128i *dataEnd;
__m128i one = _mm_cvtsi32_si128(1);
__m256i ctr2, two;
p += 2;
WIDE_LOOP_START_AVX (AVX_CTR_LOOP_START)
{
const __m256i *w = keys;
UInt32 r = numRounds - 2;
WOP (AVX_DECLARE_VAR)
AVX_WOP_KEY (AVX_CTR_START, 0)
w += 1;
do
{
AVX_WOP_KEY (AVX_AES_ENC, 0)
w += 1;
}
while (--r);
AVX_WOP_KEY (AVX_AES_ENC_LAST, 0)
WOP (AVX_CTR_END)
}
WIDE_LOOP_END_AVX (AVX_CTR_LOOP_ENC)
SINGLE_LOOP
{
UInt32 numRounds2 = *(const UInt32 *)(p - 2 + 1) - 1;
const __m128i *w = p;
__m128i m;
MM_OP (_mm_add_epi64, ctr, one)
m = _mm_xor_si128 (ctr, p[0]);
w += 1;
do
{
MM_OP_m (_mm_aesenc_si128, w[0])
MM_OP_m (_mm_aesenc_si128, w[1])
w += 2;
}
while (--numRounds2);
MM_OP_m (_mm_aesenc_si128, w[0])
MM_OP_m (_mm_aesenclast_si128, w[1])
MM_XOR (*data, m)
}
p[-2] = ctr;
}
#endif // USE_INTEL_VAES
#else // USE_INTEL_AES
/* no USE_INTEL_AES */
#pragma message("AES HW_SW stub was used")
#define AES_TYPE_keys UInt32
#define AES_TYPE_data Byte
#define AES_FUNC_START(name) \
void Z7_FASTCALL name(UInt32 *p, Byte *data, size_t numBlocks) \
#define AES_COMPAT_STUB(name) \
AES_FUNC_START(name); \
AES_FUNC_START(name ## _HW) \
{ name(p, data, numBlocks); }
AES_COMPAT_STUB (AesCbc_Encode)
AES_COMPAT_STUB (AesCbc_Decode)
AES_COMPAT_STUB (AesCtr_Code)
#endif // USE_INTEL_AES
#ifndef USE_INTEL_VAES
#pragma message("VAES HW_SW stub was used")
#define VAES_COMPAT_STUB(name) \
void Z7_FASTCALL name ## _256(UInt32 *p, Byte *data, size_t numBlocks); \
void Z7_FASTCALL name ## _256(UInt32 *p, Byte *data, size_t numBlocks) \
{ name((AES_TYPE_keys *)(void *)p, (AES_TYPE_data *)(void *)data, numBlocks); }
VAES_COMPAT_STUB (AesCbc_Decode_HW)
VAES_COMPAT_STUB (AesCtr_Code_HW)
#endif // ! USE_INTEL_VAES
#elif defined(MY_CPU_ARM_OR_ARM64) && defined(MY_CPU_LE)
#if defined(__clang__)
#if (__clang_major__ >= 8) // fix that check
#define USE_HW_AES
#endif
#elif defined(__GNUC__)
#if (__GNUC__ >= 6) // fix that check
#define USE_HW_AES
#endif
#elif defined(_MSC_VER)
#if _MSC_VER >= 1910
#define USE_HW_AES
#endif
#endif
#ifdef USE_HW_AES
// #pragma message("=== AES HW === ")
#if defined(__clang__) || defined(__GNUC__)
#ifdef MY_CPU_ARM64
#define ATTRIB_AES __attribute__((__target__("+crypto")))
#else
#define ATTRIB_AES __attribute__((__target__("fpu=crypto-neon-fp-armv8")))
#endif
#else
// _MSC_VER
// for arm32
#define _ARM_USE_NEW_NEON_INTRINSICS
#endif
#ifndef ATTRIB_AES
#define ATTRIB_AES
#endif
#if defined(_MSC_VER) && defined(MY_CPU_ARM64)
#include <arm64_neon.h>
#else
#include <arm_neon.h>
#endif
typedef uint8x16_t v128;
#define AES_FUNC_START(name) \
void Z7_FASTCALL name(UInt32 *ivAes, Byte *data8, size_t numBlocks)
// void Z7_FASTCALL name(v128 *p, v128 *data, size_t numBlocks)
#define AES_FUNC_START2(name) \
AES_FUNC_START (name); \
ATTRIB_AES \
AES_FUNC_START (name)
#define MM_OP(op, dest, src) dest = op(dest, src);
#define MM_OP_m(op, src) MM_OP(op, m, src)
#define MM_OP1_m(op) m = op(m);
#define MM_XOR( dest, src) MM_OP(veorq_u8, dest, src)
#define MM_XOR_m( src) MM_XOR(m, src)
#define AES_E_m(k) MM_OP_m (vaeseq_u8, k)
#define AES_E_MC_m(k) AES_E_m (k) MM_OP1_m(vaesmcq_u8)
AES_FUNC_START2 (AesCbc_Encode_HW)
{
v128 *p = (v128*)(void*)ivAes;
v128 *data = (v128*)(void*)data8;
v128 m = *p;
const v128 k0 = p[2];
const v128 k1 = p[3];
const v128 k2 = p[4];
const v128 k3 = p[5];
const v128 k4 = p[6];
const v128 k5 = p[7];
const v128 k6 = p[8];
const v128 k7 = p[9];
const v128 k8 = p[10];
const v128 k9 = p[11];
const UInt32 numRounds2 = *(const UInt32 *)(p + 1);
const v128 *w = p + ((size_t)numRounds2 * 2);
const v128 k_z1 = w[1];
const v128 k_z0 = w[2];
for (; numBlocks != 0; numBlocks--, data++)
{
MM_XOR_m (*data);
AES_E_MC_m (k0)
AES_E_MC_m (k1)
AES_E_MC_m (k2)
AES_E_MC_m (k3)
AES_E_MC_m (k4)
AES_E_MC_m (k5)
AES_E_MC_m (k6)
AES_E_MC_m (k7)
AES_E_MC_m (k8)
if (numRounds2 >= 6)
{
AES_E_MC_m (k9)
AES_E_MC_m (p[12])
if (numRounds2 != 6)
{
AES_E_MC_m (p[13])
AES_E_MC_m (p[14])
}
}
AES_E_m (k_z1)
MM_XOR_m (k_z0);
*data = m;
}
*p = m;
}
#define WOP_1(op)
#define WOP_2(op) WOP_1 (op) op (m1, 1)
#define WOP_3(op) WOP_2 (op) op (m2, 2)
#define WOP_4(op) WOP_3 (op) op (m3, 3)
#define WOP_5(op) WOP_4 (op) op (m4, 4)
#define WOP_6(op) WOP_5 (op) op (m5, 5)
#define WOP_7(op) WOP_6 (op) op (m6, 6)
#define WOP_8(op) WOP_7 (op) op (m7, 7)
#define NUM_WAYS 8
#define WOP_M1 WOP_8
#define WOP(op) op (m0, 0) WOP_M1(op)
#define DECLARE_VAR(reg, ii) v128 reg;
#define LOAD_data( reg, ii) reg = data[ii];
#define STORE_data( reg, ii) data[ii] = reg;
#if (NUM_WAYS > 1)
#define XOR_data_M1(reg, ii) MM_XOR (reg, data[ii- 1])
#endif
#define MM_OP_key(op, reg) MM_OP (op, reg, key)
#define AES_D_m(k) MM_OP_m (vaesdq_u8, k)
#define AES_D_IMC_m(k) AES_D_m (k) MM_OP1_m (vaesimcq_u8)
#define AES_XOR( reg, ii) MM_OP_key (veorq_u8, reg)
#define AES_D( reg, ii) MM_OP_key (vaesdq_u8, reg)
#define AES_E( reg, ii) MM_OP_key (vaeseq_u8, reg)
#define AES_D_IMC( reg, ii) AES_D (reg, ii) reg = vaesimcq_u8(reg);
#define AES_E_MC( reg, ii) AES_E (reg, ii) reg = vaesmcq_u8(reg);
#define CTR_START(reg, ii) MM_OP (vaddq_u64, ctr, one) reg = vreinterpretq_u8_u64(ctr);
#define CTR_END( reg, ii) MM_XOR (data[ii], reg)
#define WOP_KEY(op, n) { \
const v128 key = w[n]; \
WOP(op) }
#define WIDE_LOOP_START \
dataEnd = data + numBlocks; \
if (numBlocks >= NUM_WAYS) \
{ dataEnd -= NUM_WAYS; do { \
#define WIDE_LOOP_END \
data += NUM_WAYS; \
} while (data <= dataEnd); \
dataEnd += NUM_WAYS; } \
#define SINGLE_LOOP \
for (; data < dataEnd; data++)
AES_FUNC_START2 (AesCbc_Decode_HW)
{
v128 *p = (v128*)(void*)ivAes;
v128 *data = (v128*)(void*)data8;
v128 iv = *p;
const v128 *wStart = p + ((size_t)*(const UInt32 *)(p + 1)) * 2;
const v128 *dataEnd;
p += 2;
WIDE_LOOP_START
{
const v128 *w = wStart;
WOP (DECLARE_VAR)
WOP (LOAD_data)
WOP_KEY (AES_D_IMC, 2)
do
{
WOP_KEY (AES_D_IMC, 1)
WOP_KEY (AES_D_IMC, 0)
w -= 2;
}
while (w != p);
WOP_KEY (AES_D, 1)
WOP_KEY (AES_XOR, 0)
MM_XOR (m0, iv);
WOP_M1 (XOR_data_M1)
iv = data[NUM_WAYS - 1];
WOP (STORE_data)
}
WIDE_LOOP_END
SINGLE_LOOP
{
const v128 *w = wStart;
v128 m = *data;
AES_D_IMC_m (w[2])
do
{
AES_D_IMC_m (w[1]);
AES_D_IMC_m (w[0]);
w -= 2;
}
while (w != p);
AES_D_m (w[1]);
MM_XOR_m (w[0]);
MM_XOR_m (iv);
iv = *data;
*data = m;
}
p[-2] = iv;
}
AES_FUNC_START2 (AesCtr_Code_HW)
{
v128 *p = (v128*)(void*)ivAes;
v128 *data = (v128*)(void*)data8;
uint64x2_t ctr = vreinterpretq_u64_u8(*p);
const v128 *wEnd = p + ((size_t)*(const UInt32 *)(p + 1)) * 2;
const v128 *dataEnd;
uint64x2_t one = vdupq_n_u64(0);
one = vsetq_lane_u64(1, one, 0);
p += 2;
WIDE_LOOP_START
{
const v128 *w = p;
WOP (DECLARE_VAR)
WOP (CTR_START)
do
{
WOP_KEY (AES_E_MC, 0)
WOP_KEY (AES_E_MC, 1)
w += 2;
}
while (w != wEnd);
WOP_KEY (AES_E_MC, 0)
WOP_KEY (AES_E, 1)
WOP_KEY (AES_XOR, 2)
WOP (CTR_END)
}
WIDE_LOOP_END
SINGLE_LOOP
{
const v128 *w = p;
v128 m;
CTR_START (m, 0);
do
{
AES_E_MC_m (w[0]);
AES_E_MC_m (w[1]);
w += 2;
}
while (w != wEnd);
AES_E_MC_m (w[0])
AES_E_m (w[1])
MM_XOR_m (w[2])
CTR_END (m, 0)
}
p[-2] = vreinterpretq_u8_u64(ctr);
}
#endif // USE_HW_AES
#endif // MY_CPU_ARM_OR_ARM64
#undef NUM_WAYS
#undef WOP_M1
#undef WOP
#undef DECLARE_VAR
#undef LOAD_data
#undef STORE_data
#undef USE_INTEL_AES
#undef USE_HW_AES

528
C/Alloc.c
View File

@@ -1,33 +1,182 @@
/* Alloc.c -- Memory allocation functions
2008-09-24
Igor Pavlov
Public domain */
2023-04-02 : Igor Pavlov : Public domain */
#include "Precomp.h"
#ifdef _WIN32
#include <windows.h>
#include "7zWindows.h"
#endif
#include <stdlib.h>
#include "Alloc.h"
/* #define _SZ_ALLOC_DEBUG */
/* use _SZ_ALLOC_DEBUG to debug alloc/free operations */
#ifdef _SZ_ALLOC_DEBUG
#include <stdio.h>
int g_allocCount = 0;
int g_allocCountMid = 0;
int g_allocCountBig = 0;
#ifdef _WIN32
#ifdef Z7_LARGE_PAGES
#if defined(__clang__) || defined(__GNUC__)
typedef void (*Z7_voidFunction)(void);
#define MY_CAST_FUNC (Z7_voidFunction)
#elif defined(_MSC_VER) && _MSC_VER > 1920
#define MY_CAST_FUNC (void *)
// #pragma warning(disable : 4191) // 'type cast': unsafe conversion from 'FARPROC' to 'void (__cdecl *)()'
#else
#define MY_CAST_FUNC
#endif
#endif // Z7_LARGE_PAGES
#endif // _WIN32
// #define SZ_ALLOC_DEBUG
/* #define SZ_ALLOC_DEBUG */
/* use SZ_ALLOC_DEBUG to debug alloc/free operations */
#ifdef SZ_ALLOC_DEBUG
#include <string.h>
#include <stdio.h>
static int g_allocCount = 0;
#ifdef _WIN32
static int g_allocCountMid = 0;
static int g_allocCountBig = 0;
#endif
#define CONVERT_INT_TO_STR(charType, tempSize) \
char temp[tempSize]; unsigned i = 0; \
while (val >= 10) { temp[i++] = (char)('0' + (unsigned)(val % 10)); val /= 10; } \
*s++ = (charType)('0' + (unsigned)val); \
while (i != 0) { i--; *s++ = temp[i]; } \
*s = 0;
static void ConvertUInt64ToString(UInt64 val, char *s)
{
CONVERT_INT_TO_STR(char, 24)
}
#define GET_HEX_CHAR(t) ((char)(((t < 10) ? ('0' + t) : ('A' + (t - 10)))))
static void ConvertUInt64ToHex(UInt64 val, char *s)
{
UInt64 v = val;
unsigned i;
for (i = 1;; i++)
{
v >>= 4;
if (v == 0)
break;
}
s[i] = 0;
do
{
unsigned t = (unsigned)(val & 0xF);
val >>= 4;
s[--i] = GET_HEX_CHAR(t);
}
while (i);
}
#define DEBUG_OUT_STREAM stderr
static void Print(const char *s)
{
fputs(s, DEBUG_OUT_STREAM);
}
static void PrintAligned(const char *s, size_t align)
{
size_t len = strlen(s);
for(;;)
{
fputc(' ', DEBUG_OUT_STREAM);
if (len >= align)
break;
++len;
}
Print(s);
}
static void PrintLn(void)
{
Print("\n");
}
static void PrintHex(UInt64 v, size_t align)
{
char s[32];
ConvertUInt64ToHex(v, s);
PrintAligned(s, align);
}
static void PrintDec(int v, size_t align)
{
char s[32];
ConvertUInt64ToString((unsigned)v, s);
PrintAligned(s, align);
}
static void PrintAddr(void *p)
{
PrintHex((UInt64)(size_t)(ptrdiff_t)p, 12);
}
#define PRINT_REALLOC(name, cnt, size, ptr) { \
Print(name " "); \
if (!ptr) PrintDec(cnt++, 10); \
PrintHex(size, 10); \
PrintAddr(ptr); \
PrintLn(); }
#define PRINT_ALLOC(name, cnt, size, ptr) { \
Print(name " "); \
PrintDec(cnt++, 10); \
PrintHex(size, 10); \
PrintAddr(ptr); \
PrintLn(); }
#define PRINT_FREE(name, cnt, ptr) if (ptr) { \
Print(name " "); \
PrintDec(--cnt, 10); \
PrintAddr(ptr); \
PrintLn(); }
#else
#ifdef _WIN32
#define PRINT_ALLOC(name, cnt, size, ptr)
#endif
#define PRINT_FREE(name, cnt, ptr)
#define Print(s)
#define PrintLn()
#define PrintHex(v, align)
#define PrintAddr(p)
#endif
/*
by specification:
malloc(non_NULL, 0) : returns NULL or a unique pointer value that can later be successfully passed to free()
realloc(NULL, size) : the call is equivalent to malloc(size)
realloc(non_NULL, 0) : the call is equivalent to free(ptr)
in main compilers:
malloc(0) : returns non_NULL
realloc(NULL, 0) : returns non_NULL
realloc(non_NULL, 0) : returns NULL
*/
void *MyAlloc(size_t size)
{
if (size == 0)
return 0;
#ifdef _SZ_ALLOC_DEBUG
return NULL;
// PRINT_ALLOC("Alloc ", g_allocCount, size, NULL)
#ifdef SZ_ALLOC_DEBUG
{
void *p = malloc(size);
fprintf(stderr, "\nAlloc %10d bytes, count = %10d, addr = %8X", size, g_allocCount++, (unsigned)p);
if (p)
{
PRINT_ALLOC("Alloc ", g_allocCount, size, p)
}
return p;
}
#else
@@ -37,91 +186,350 @@ void *MyAlloc(size_t size)
void MyFree(void *address)
{
#ifdef _SZ_ALLOC_DEBUG
if (address != 0)
fprintf(stderr, "\nFree; count = %10d, addr = %8X", --g_allocCount, (unsigned)address);
#endif
PRINT_FREE("Free ", g_allocCount, address)
free(address);
}
void *MyRealloc(void *address, size_t size)
{
if (size == 0)
{
MyFree(address);
return NULL;
}
// PRINT_REALLOC("Realloc ", g_allocCount, size, address)
#ifdef SZ_ALLOC_DEBUG
{
void *p = realloc(address, size);
if (p)
{
PRINT_REALLOC("Realloc ", g_allocCount, size, address)
}
return p;
}
#else
return realloc(address, size);
#endif
}
#ifdef _WIN32
void *MidAlloc(size_t size)
{
if (size == 0)
return 0;
#ifdef _SZ_ALLOC_DEBUG
fprintf(stderr, "\nAlloc_Mid %10d bytes; count = %10d", size, g_allocCountMid++);
return NULL;
#ifdef SZ_ALLOC_DEBUG
{
void *p = VirtualAlloc(NULL, size, MEM_COMMIT, PAGE_READWRITE);
if (p)
{
PRINT_ALLOC("Alloc-Mid", g_allocCountMid, size, p)
}
return p;
}
#else
return VirtualAlloc(NULL, size, MEM_COMMIT, PAGE_READWRITE);
#endif
return VirtualAlloc(0, size, MEM_COMMIT, PAGE_READWRITE);
}
void MidFree(void *address)
{
#ifdef _SZ_ALLOC_DEBUG
if (address != 0)
fprintf(stderr, "\nFree_Mid; count = %10d", --g_allocCountMid);
#endif
if (address == 0)
PRINT_FREE("Free-Mid", g_allocCountMid, address)
if (!address)
return;
VirtualFree(address, 0, MEM_RELEASE);
}
#ifndef MEM_LARGE_PAGES
#undef _7ZIP_LARGE_PAGES
#ifdef Z7_LARGE_PAGES
#ifdef MEM_LARGE_PAGES
#define MY__MEM_LARGE_PAGES MEM_LARGE_PAGES
#else
#define MY__MEM_LARGE_PAGES 0x20000000
#endif
#ifdef _7ZIP_LARGE_PAGES
extern
SIZE_T g_LargePageSize;
SIZE_T g_LargePageSize = 0;
typedef SIZE_T (WINAPI *GetLargePageMinimumP)();
#endif
typedef SIZE_T (WINAPI *Func_GetLargePageMinimum)(VOID);
void SetLargePageSize()
void SetLargePageSize(void)
{
#ifdef _7ZIP_LARGE_PAGES
SIZE_T size = 0;
GetLargePageMinimumP largePageMinimum = (GetLargePageMinimumP)
GetProcAddress(GetModuleHandle(TEXT("kernel32.dll")), "GetLargePageMinimum");
if (largePageMinimum == 0)
#ifdef Z7_LARGE_PAGES
SIZE_T size;
const
Func_GetLargePageMinimum fn =
(Func_GetLargePageMinimum) MY_CAST_FUNC GetProcAddress(GetModuleHandle(TEXT("kernel32.dll")),
"GetLargePageMinimum");
if (!fn)
return;
size = largePageMinimum();
size = fn();
if (size == 0 || (size & (size - 1)) != 0)
return;
g_LargePageSize = size;
#endif
}
#endif // Z7_LARGE_PAGES
void *BigAlloc(size_t size)
{
if (size == 0)
return 0;
#ifdef _SZ_ALLOC_DEBUG
fprintf(stderr, "\nAlloc_Big %10d bytes; count = %10d", size, g_allocCountBig++);
#endif
#ifdef _7ZIP_LARGE_PAGES
if (g_LargePageSize != 0 && g_LargePageSize <= (1 << 30) && size >= (1 << 18))
return NULL;
PRINT_ALLOC("Alloc-Big", g_allocCountBig, size, NULL)
#ifdef Z7_LARGE_PAGES
{
void *res = VirtualAlloc(0, (size + g_LargePageSize - 1) & (~(g_LargePageSize - 1)),
MEM_COMMIT | MEM_LARGE_PAGES, PAGE_READWRITE);
if (res != 0)
return res;
SIZE_T ps = g_LargePageSize;
if (ps != 0 && ps <= (1 << 30) && size > (ps / 2))
{
size_t size2;
ps--;
size2 = (size + ps) & ~ps;
if (size2 >= size)
{
void *p = VirtualAlloc(NULL, size2, MEM_COMMIT | MY__MEM_LARGE_PAGES, PAGE_READWRITE);
if (p)
{
PRINT_ALLOC("Alloc-BM ", g_allocCountMid, size2, p)
return p;
}
}
}
}
#endif
return VirtualAlloc(0, size, MEM_COMMIT, PAGE_READWRITE);
return MidAlloc(size);
}
void BigFree(void *address)
{
#ifdef _SZ_ALLOC_DEBUG
if (address != 0)
fprintf(stderr, "\nFree_Big; count = %10d", --g_allocCountBig);
#endif
if (address == 0)
return;
VirtualFree(address, 0, MEM_RELEASE);
PRINT_FREE("Free-Big", g_allocCountBig, address)
MidFree(address);
}
#endif // _WIN32
static void *SzAlloc(ISzAllocPtr p, size_t size) { UNUSED_VAR(p) return MyAlloc(size); }
static void SzFree(ISzAllocPtr p, void *address) { UNUSED_VAR(p) MyFree(address); }
const ISzAlloc g_Alloc = { SzAlloc, SzFree };
#ifdef _WIN32
static void *SzMidAlloc(ISzAllocPtr p, size_t size) { UNUSED_VAR(p) return MidAlloc(size); }
static void SzMidFree(ISzAllocPtr p, void *address) { UNUSED_VAR(p) MidFree(address); }
static void *SzBigAlloc(ISzAllocPtr p, size_t size) { UNUSED_VAR(p) return BigAlloc(size); }
static void SzBigFree(ISzAllocPtr p, void *address) { UNUSED_VAR(p) BigFree(address); }
const ISzAlloc g_MidAlloc = { SzMidAlloc, SzMidFree };
const ISzAlloc g_BigAlloc = { SzBigAlloc, SzBigFree };
#endif
/*
uintptr_t : <stdint.h> C99 (optional)
: unsupported in VS6
*/
#ifdef _WIN32
typedef UINT_PTR UIntPtr;
#else
/*
typedef uintptr_t UIntPtr;
*/
typedef ptrdiff_t UIntPtr;
#endif
#define ADJUST_ALLOC_SIZE 0
/*
#define ADJUST_ALLOC_SIZE (sizeof(void *) - 1)
*/
/*
Use (ADJUST_ALLOC_SIZE = (sizeof(void *) - 1)), if
MyAlloc() can return address that is NOT multiple of sizeof(void *).
*/
/*
#define MY_ALIGN_PTR_DOWN(p, align) ((void *)((char *)(p) - ((size_t)(UIntPtr)(p) & ((align) - 1))))
*/
#define MY_ALIGN_PTR_DOWN(p, align) ((void *)((((UIntPtr)(p)) & ~((UIntPtr)(align) - 1))))
#if !defined(_WIN32) && defined(_POSIX_C_SOURCE) && (_POSIX_C_SOURCE >= 200112L)
#define USE_posix_memalign
#endif
#ifndef USE_posix_memalign
#define MY_ALIGN_PTR_UP_PLUS(p, align) MY_ALIGN_PTR_DOWN(((char *)(p) + (align) + ADJUST_ALLOC_SIZE), align)
#endif
/*
This posix_memalign() is for test purposes only.
We also need special Free() function instead of free(),
if this posix_memalign() is used.
*/
/*
static int posix_memalign(void **ptr, size_t align, size_t size)
{
size_t newSize = size + align;
void *p;
void *pAligned;
*ptr = NULL;
if (newSize < size)
return 12; // ENOMEM
p = MyAlloc(newSize);
if (!p)
return 12; // ENOMEM
pAligned = MY_ALIGN_PTR_UP_PLUS(p, align);
((void **)pAligned)[-1] = p;
*ptr = pAligned;
return 0;
}
*/
/*
ALLOC_ALIGN_SIZE >= sizeof(void *)
ALLOC_ALIGN_SIZE >= cache_line_size
*/
#define ALLOC_ALIGN_SIZE ((size_t)1 << 7)
static void *SzAlignedAlloc(ISzAllocPtr pp, size_t size)
{
#ifndef USE_posix_memalign
void *p;
void *pAligned;
size_t newSize;
UNUSED_VAR(pp)
/* also we can allocate additional dummy ALLOC_ALIGN_SIZE bytes after aligned
block to prevent cache line sharing with another allocated blocks */
newSize = size + ALLOC_ALIGN_SIZE * 1 + ADJUST_ALLOC_SIZE;
if (newSize < size)
return NULL;
p = MyAlloc(newSize);
if (!p)
return NULL;
pAligned = MY_ALIGN_PTR_UP_PLUS(p, ALLOC_ALIGN_SIZE);
Print(" size="); PrintHex(size, 8);
Print(" a_size="); PrintHex(newSize, 8);
Print(" ptr="); PrintAddr(p);
Print(" a_ptr="); PrintAddr(pAligned);
PrintLn();
((void **)pAligned)[-1] = p;
return pAligned;
#else
void *p;
UNUSED_VAR(pp)
if (posix_memalign(&p, ALLOC_ALIGN_SIZE, size))
return NULL;
Print(" posix_memalign="); PrintAddr(p);
PrintLn();
return p;
#endif
}
static void SzAlignedFree(ISzAllocPtr pp, void *address)
{
UNUSED_VAR(pp)
#ifndef USE_posix_memalign
if (address)
MyFree(((void **)address)[-1]);
#else
free(address);
#endif
}
const ISzAlloc g_AlignedAlloc = { SzAlignedAlloc, SzAlignedFree };
#define MY_ALIGN_PTR_DOWN_1(p) MY_ALIGN_PTR_DOWN(p, sizeof(void *))
/* we align ptr to support cases where CAlignOffsetAlloc::offset is not multiply of sizeof(void *) */
#define REAL_BLOCK_PTR_VAR(p) ((void **)MY_ALIGN_PTR_DOWN_1(p))[-1]
/*
#define REAL_BLOCK_PTR_VAR(p) ((void **)(p))[-1]
*/
static void *AlignOffsetAlloc_Alloc(ISzAllocPtr pp, size_t size)
{
const CAlignOffsetAlloc *p = Z7_CONTAINER_FROM_VTBL_CONST(pp, CAlignOffsetAlloc, vt);
void *adr;
void *pAligned;
size_t newSize;
size_t extra;
size_t alignSize = (size_t)1 << p->numAlignBits;
if (alignSize < sizeof(void *))
alignSize = sizeof(void *);
if (p->offset >= alignSize)
return NULL;
/* also we can allocate additional dummy ALLOC_ALIGN_SIZE bytes after aligned
block to prevent cache line sharing with another allocated blocks */
extra = p->offset & (sizeof(void *) - 1);
newSize = size + alignSize + extra + ADJUST_ALLOC_SIZE;
if (newSize < size)
return NULL;
adr = ISzAlloc_Alloc(p->baseAlloc, newSize);
if (!adr)
return NULL;
pAligned = (char *)MY_ALIGN_PTR_DOWN((char *)adr +
alignSize - p->offset + extra + ADJUST_ALLOC_SIZE, alignSize) + p->offset;
PrintLn();
Print("- Aligned: ");
Print(" size="); PrintHex(size, 8);
Print(" a_size="); PrintHex(newSize, 8);
Print(" ptr="); PrintAddr(adr);
Print(" a_ptr="); PrintAddr(pAligned);
PrintLn();
REAL_BLOCK_PTR_VAR(pAligned) = adr;
return pAligned;
}
static void AlignOffsetAlloc_Free(ISzAllocPtr pp, void *address)
{
if (address)
{
const CAlignOffsetAlloc *p = Z7_CONTAINER_FROM_VTBL_CONST(pp, CAlignOffsetAlloc, vt);
PrintLn();
Print("- Aligned Free: ");
PrintLn();
ISzAlloc_Free(p->baseAlloc, REAL_BLOCK_PTR_VAR(address));
}
}
void AlignOffsetAlloc_CreateVTable(CAlignOffsetAlloc *p)
{
p->vt.Alloc = AlignOffsetAlloc_Alloc;
p->vt.Free = AlignOffsetAlloc_Free;
}

View File

@@ -1,19 +1,32 @@
/* Alloc.h -- Memory allocation functions
2008-03-13
Igor Pavlov
Public domain */
2023-03-04 : Igor Pavlov : Public domain */
#ifndef __COMMON_ALLOC_H
#define __COMMON_ALLOC_H
#ifndef ZIP7_INC_ALLOC_H
#define ZIP7_INC_ALLOC_H
#include <stddef.h>
#include "7zTypes.h"
EXTERN_C_BEGIN
/*
MyFree(NULL) : is allowed, as free(NULL)
MyAlloc(0) : returns NULL : but malloc(0) is allowed to return NULL or non_NULL
MyRealloc(NULL, 0) : returns NULL : but realloc(NULL, 0) is allowed to return NULL or non_NULL
MyRealloc() is similar to realloc() for the following cases:
MyRealloc(non_NULL, 0) : returns NULL and always calls MyFree(ptr)
MyRealloc(NULL, non_ZERO) : returns NULL, if allocation failed
MyRealloc(non_NULL, non_ZERO) : returns NULL, if reallocation failed
*/
void *MyAlloc(size_t size);
void MyFree(void *address);
void *MyRealloc(void *address, size_t size);
#ifdef _WIN32
void SetLargePageSize();
#ifdef Z7_LARGE_PAGES
void SetLargePageSize(void);
#endif
void *MidAlloc(size_t size);
void MidFree(void *address);
@@ -29,4 +42,30 @@ void BigFree(void *address);
#endif
extern const ISzAlloc g_Alloc;
#ifdef _WIN32
extern const ISzAlloc g_BigAlloc;
extern const ISzAlloc g_MidAlloc;
#else
#define g_BigAlloc g_AlignedAlloc
#define g_MidAlloc g_AlignedAlloc
#endif
extern const ISzAlloc g_AlignedAlloc;
typedef struct
{
ISzAlloc vt;
ISzAllocPtr baseAlloc;
unsigned numAlignBits; /* ((1 << numAlignBits) >= sizeof(void *)) */
size_t offset; /* (offset == (k * sizeof(void *)) && offset < (1 << numAlignBits) */
} CAlignOffsetAlloc;
void AlignOffsetAlloc_CreateVTable(CAlignOffsetAlloc *p);
EXTERN_C_END
#endif

View File

@@ -1,77 +0,0 @@
/* 7zAlloc.c -- Allocation functions
2008-10-04 : Igor Pavlov : Public domain */
#include <stdlib.h>
#include "7zAlloc.h"
/* #define _SZ_ALLOC_DEBUG */
/* use _SZ_ALLOC_DEBUG to debug alloc/free operations */
#ifdef _SZ_ALLOC_DEBUG
#ifdef _WIN32
#include <windows.h>
#endif
#include <stdio.h>
int g_allocCount = 0;
int g_allocCountTemp = 0;
#endif
void *SzAlloc(void *p, size_t size)
{
p = p;
if (size == 0)
return 0;
#ifdef _SZ_ALLOC_DEBUG
fprintf(stderr, "\nAlloc %10d bytes; count = %10d", size, g_allocCount);
g_allocCount++;
#endif
return malloc(size);
}
void SzFree(void *p, void *address)
{
p = p;
#ifdef _SZ_ALLOC_DEBUG
if (address != 0)
{
g_allocCount--;
fprintf(stderr, "\nFree; count = %10d", g_allocCount);
}
#endif
free(address);
}
void *SzAllocTemp(void *p, size_t size)
{
p = p;
if (size == 0)
return 0;
#ifdef _SZ_ALLOC_DEBUG
fprintf(stderr, "\nAlloc_temp %10d bytes; count = %10d", size, g_allocCountTemp);
g_allocCountTemp++;
#ifdef _WIN32
return HeapAlloc(GetProcessHeap(), 0, size);
#endif
#endif
return malloc(size);
}
void SzFreeTemp(void *p, void *address)
{
p = p;
#ifdef _SZ_ALLOC_DEBUG
if (address != 0)
{
g_allocCountTemp--;
fprintf(stderr, "\nFree_temp; count = %10d", g_allocCountTemp);
}
#ifdef _WIN32
HeapFree(GetProcessHeap(), 0, address);
return;
#endif
#endif
free(address);
}

View File

@@ -1,15 +0,0 @@
/* 7zAlloc.h -- Allocation functions
2008-10-04 : Igor Pavlov : Public domain */
#ifndef __7Z_ALLOC_H
#define __7Z_ALLOC_H
#include <stddef.h>
void *SzAlloc(void *p, size_t size);
void SzFree(void *p, void *address);
void *SzAllocTemp(void *p, size_t size);
void SzFreeTemp(void *p, void *address);
#endif

View File

@@ -1,254 +0,0 @@
/* 7zDecode.c -- Decoding from 7z folder
2008-11-23 : Igor Pavlov : Public domain */
#include <string.h>
#include "../../Bcj2.h"
#include "../../Bra.h"
#include "../../LzmaDec.h"
#include "7zDecode.h"
#define k_Copy 0
#define k_LZMA 0x30101
#define k_BCJ 0x03030103
#define k_BCJ2 0x0303011B
static SRes SzDecodeLzma(CSzCoderInfo *coder, UInt64 inSize, ILookInStream *inStream,
Byte *outBuffer, SizeT outSize, ISzAlloc *allocMain)
{
CLzmaDec state;
SRes res = SZ_OK;
LzmaDec_Construct(&state);
RINOK(LzmaDec_AllocateProbs(&state, coder->Props.data, (unsigned)coder->Props.size, allocMain));
state.dic = outBuffer;
state.dicBufSize = outSize;
LzmaDec_Init(&state);
for (;;)
{
Byte *inBuf = NULL;
size_t lookahead = (1 << 18);
if (lookahead > inSize)
lookahead = (size_t)inSize;
res = inStream->Look((void *)inStream, (void **)&inBuf, &lookahead);
if (res != SZ_OK)
break;
{
SizeT inProcessed = (SizeT)lookahead, dicPos = state.dicPos;
ELzmaStatus status;
res = LzmaDec_DecodeToDic(&state, outSize, inBuf, &inProcessed, LZMA_FINISH_END, &status);
lookahead -= inProcessed;
inSize -= inProcessed;
if (res != SZ_OK)
break;
if (state.dicPos == state.dicBufSize || (inProcessed == 0 && dicPos == state.dicPos))
{
if (state.dicBufSize != outSize || lookahead != 0 ||
(status != LZMA_STATUS_FINISHED_WITH_MARK &&
status != LZMA_STATUS_MAYBE_FINISHED_WITHOUT_MARK))
res = SZ_ERROR_DATA;
break;
}
res = inStream->Skip((void *)inStream, inProcessed);
if (res != SZ_OK)
break;
}
}
LzmaDec_FreeProbs(&state, allocMain);
return res;
}
static SRes SzDecodeCopy(UInt64 inSize, ILookInStream *inStream, Byte *outBuffer)
{
while (inSize > 0)
{
void *inBuf;
size_t curSize = (1 << 18);
if (curSize > inSize)
curSize = (size_t)inSize;
RINOK(inStream->Look((void *)inStream, (void **)&inBuf, &curSize));
if (curSize == 0)
return SZ_ERROR_INPUT_EOF;
memcpy(outBuffer, inBuf, curSize);
outBuffer += curSize;
inSize -= curSize;
RINOK(inStream->Skip((void *)inStream, curSize));
}
return SZ_OK;
}
#define IS_UNSUPPORTED_METHOD(m) ((m) != k_Copy && (m) != k_LZMA)
#define IS_UNSUPPORTED_CODER(c) (IS_UNSUPPORTED_METHOD(c.MethodID) || c.NumInStreams != 1 || c.NumOutStreams != 1)
#define IS_NO_BCJ(c) (c.MethodID != k_BCJ || c.NumInStreams != 1 || c.NumOutStreams != 1)
#define IS_NO_BCJ2(c) (c.MethodID != k_BCJ2 || c.NumInStreams != 4 || c.NumOutStreams != 1)
SRes CheckSupportedFolder(const CSzFolder *f)
{
if (f->NumCoders < 1 || f->NumCoders > 4)
return SZ_ERROR_UNSUPPORTED;
if (IS_UNSUPPORTED_CODER(f->Coders[0]))
return SZ_ERROR_UNSUPPORTED;
if (f->NumCoders == 1)
{
if (f->NumPackStreams != 1 || f->PackStreams[0] != 0 || f->NumBindPairs != 0)
return SZ_ERROR_UNSUPPORTED;
return SZ_OK;
}
if (f->NumCoders == 2)
{
if (IS_NO_BCJ(f->Coders[1]) ||
f->NumPackStreams != 1 || f->PackStreams[0] != 0 ||
f->NumBindPairs != 1 ||
f->BindPairs[0].InIndex != 1 || f->BindPairs[0].OutIndex != 0)
return SZ_ERROR_UNSUPPORTED;
return SZ_OK;
}
if (f->NumCoders == 4)
{
if (IS_UNSUPPORTED_CODER(f->Coders[1]) ||
IS_UNSUPPORTED_CODER(f->Coders[2]) ||
IS_NO_BCJ2(f->Coders[3]))
return SZ_ERROR_UNSUPPORTED;
if (f->NumPackStreams != 4 ||
f->PackStreams[0] != 2 ||
f->PackStreams[1] != 6 ||
f->PackStreams[2] != 1 ||
f->PackStreams[3] != 0 ||
f->NumBindPairs != 3 ||
f->BindPairs[0].InIndex != 5 || f->BindPairs[0].OutIndex != 0 ||
f->BindPairs[1].InIndex != 4 || f->BindPairs[1].OutIndex != 1 ||
f->BindPairs[2].InIndex != 3 || f->BindPairs[2].OutIndex != 2)
return SZ_ERROR_UNSUPPORTED;
return SZ_OK;
}
return SZ_ERROR_UNSUPPORTED;
}
UInt64 GetSum(const UInt64 *values, UInt32 index)
{
UInt64 sum = 0;
UInt32 i;
for (i = 0; i < index; i++)
sum += values[i];
return sum;
}
SRes SzDecode2(const UInt64 *packSizes, const CSzFolder *folder,
ILookInStream *inStream, UInt64 startPos,
Byte *outBuffer, SizeT outSize, ISzAlloc *allocMain,
Byte *tempBuf[])
{
UInt32 ci;
SizeT tempSizes[3] = { 0, 0, 0};
SizeT tempSize3 = 0;
Byte *tempBuf3 = 0;
RINOK(CheckSupportedFolder(folder));
for (ci = 0; ci < folder->NumCoders; ci++)
{
CSzCoderInfo *coder = &folder->Coders[ci];
if (coder->MethodID == k_Copy || coder->MethodID == k_LZMA)
{
UInt32 si = 0;
UInt64 offset;
UInt64 inSize;
Byte *outBufCur = outBuffer;
SizeT outSizeCur = outSize;
if (folder->NumCoders == 4)
{
UInt32 indices[] = { 3, 2, 0 };
UInt64 unpackSize = folder->UnpackSizes[ci];
si = indices[ci];
if (ci < 2)
{
Byte *temp;
outSizeCur = (SizeT)unpackSize;
if (outSizeCur != unpackSize)
return SZ_ERROR_MEM;
temp = (Byte *)IAlloc_Alloc(allocMain, outSizeCur);
if (temp == 0 && outSizeCur != 0)
return SZ_ERROR_MEM;
outBufCur = tempBuf[1 - ci] = temp;
tempSizes[1 - ci] = outSizeCur;
}
else if (ci == 2)
{
if (unpackSize > outSize) /* check it */
return SZ_ERROR_PARAM;
tempBuf3 = outBufCur = outBuffer + (outSize - (size_t)unpackSize);
tempSize3 = outSizeCur = (SizeT)unpackSize;
}
else
return SZ_ERROR_UNSUPPORTED;
}
offset = GetSum(packSizes, si);
inSize = packSizes[si];
RINOK(LookInStream_SeekTo(inStream, startPos + offset));
if (coder->MethodID == k_Copy)
{
if (inSize != outSizeCur) /* check it */
return SZ_ERROR_DATA;
RINOK(SzDecodeCopy(inSize, inStream, outBufCur));
}
else
{
RINOK(SzDecodeLzma(coder, inSize, inStream, outBufCur, outSizeCur, allocMain));
}
}
else if (coder->MethodID == k_BCJ)
{
UInt32 state;
if (ci != 1)
return SZ_ERROR_UNSUPPORTED;
x86_Convert_Init(state);
x86_Convert(outBuffer, outSize, 0, &state, 0);
}
else if (coder->MethodID == k_BCJ2)
{
UInt64 offset = GetSum(packSizes, 1);
UInt64 s3Size = packSizes[1];
SRes res;
if (ci != 3)
return SZ_ERROR_UNSUPPORTED;
RINOK(LookInStream_SeekTo(inStream, startPos + offset));
tempSizes[2] = (SizeT)s3Size;
if (tempSizes[2] != s3Size)
return SZ_ERROR_MEM;
tempBuf[2] = (Byte *)IAlloc_Alloc(allocMain, tempSizes[2]);
if (tempBuf[2] == 0 && tempSizes[2] != 0)
return SZ_ERROR_MEM;
res = SzDecodeCopy(s3Size, inStream, tempBuf[2]);
RINOK(res)
res = Bcj2_Decode(
tempBuf3, tempSize3,
tempBuf[0], tempSizes[0],
tempBuf[1], tempSizes[1],
tempBuf[2], tempSizes[2],
outBuffer, outSize);
RINOK(res)
}
else
return SZ_ERROR_UNSUPPORTED;
}
return SZ_OK;
}
SRes SzDecode(const UInt64 *packSizes, const CSzFolder *folder,
ILookInStream *inStream, UInt64 startPos,
Byte *outBuffer, size_t outSize, ISzAlloc *allocMain)
{
Byte *tempBuf[3] = { 0, 0, 0};
int i;
SRes res = SzDecode2(packSizes, folder, inStream, startPos,
outBuffer, (SizeT)outSize, allocMain, tempBuf);
for (i = 0; i < 3; i++)
IAlloc_Free(allocMain, tempBuf[i]);
return res;
}

View File

@@ -1,13 +0,0 @@
/* 7zDecode.h -- Decoding from 7z folder
2008-11-23 : Igor Pavlov : Public domain */
#ifndef __7Z_DECODE_H
#define __7Z_DECODE_H
#include "7zItem.h"
SRes SzDecode(const UInt64 *packSizes, const CSzFolder *folder,
ILookInStream *stream, UInt64 startPos,
Byte *outBuffer, size_t outSize, ISzAlloc *allocMain);
#endif

View File

@@ -1,93 +0,0 @@
/* 7zExtract.c -- Extracting from 7z archive
2008-11-23 : Igor Pavlov : Public domain */
#include "../../7zCrc.h"
#include "7zDecode.h"
#include "7zExtract.h"
SRes SzAr_Extract(
const CSzArEx *p,
ILookInStream *inStream,
UInt32 fileIndex,
UInt32 *blockIndex,
Byte **outBuffer,
size_t *outBufferSize,
size_t *offset,
size_t *outSizeProcessed,
ISzAlloc *allocMain,
ISzAlloc *allocTemp)
{
UInt32 folderIndex = p->FileIndexToFolderIndexMap[fileIndex];
SRes res = SZ_OK;
*offset = 0;
*outSizeProcessed = 0;
if (folderIndex == (UInt32)-1)
{
IAlloc_Free(allocMain, *outBuffer);
*blockIndex = folderIndex;
*outBuffer = 0;
*outBufferSize = 0;
return SZ_OK;
}
if (*outBuffer == 0 || *blockIndex != folderIndex)
{
CSzFolder *folder = p->db.Folders + folderIndex;
UInt64 unpackSizeSpec = SzFolder_GetUnpackSize(folder);
size_t unpackSize = (size_t)unpackSizeSpec;
UInt64 startOffset = SzArEx_GetFolderStreamPos(p, folderIndex, 0);
if (unpackSize != unpackSizeSpec)
return SZ_ERROR_MEM;
*blockIndex = folderIndex;
IAlloc_Free(allocMain, *outBuffer);
*outBuffer = 0;
RINOK(LookInStream_SeekTo(inStream, startOffset));
if (res == SZ_OK)
{
*outBufferSize = unpackSize;
if (unpackSize != 0)
{
*outBuffer = (Byte *)IAlloc_Alloc(allocMain, unpackSize);
if (*outBuffer == 0)
res = SZ_ERROR_MEM;
}
if (res == SZ_OK)
{
res = SzDecode(p->db.PackSizes +
p->FolderStartPackStreamIndex[folderIndex], folder,
inStream, startOffset,
*outBuffer, unpackSize, allocTemp);
if (res == SZ_OK)
{
if (folder->UnpackCRCDefined)
{
if (CrcCalc(*outBuffer, unpackSize) != folder->UnpackCRC)
res = SZ_ERROR_CRC;
}
}
}
}
}
if (res == SZ_OK)
{
UInt32 i;
CSzFileItem *fileItem = p->db.Files + fileIndex;
*offset = 0;
for (i = p->FolderStartFileIndex[folderIndex]; i < fileIndex; i++)
*offset += (UInt32)p->db.Files[i].Size;
*outSizeProcessed = (size_t)fileItem->Size;
if (*offset + *outSizeProcessed > *outBufferSize)
return SZ_ERROR_FAIL;
{
if (fileItem->FileCRCDefined)
{
if (CrcCalc(*outBuffer + *offset, *outSizeProcessed) != fileItem->FileCRC)
res = SZ_ERROR_CRC;
}
}
}
return res;
}

View File

@@ -1,41 +0,0 @@
/* 7zExtract.h -- Extracting from 7z archive
2008-11-23 : Igor Pavlov : Public domain */
#ifndef __7Z_EXTRACT_H
#define __7Z_EXTRACT_H
#include "7zIn.h"
/*
SzExtract extracts file from archive
*outBuffer must be 0 before first call for each new archive.
Extracting cache:
If you need to decompress more than one file, you can send
these values from previous call:
*blockIndex,
*outBuffer,
*outBufferSize
You can consider "*outBuffer" as cache of solid block. If your archive is solid,
it will increase decompression speed.
If you use external function, you can declare these 3 cache variables
(blockIndex, outBuffer, outBufferSize) as static in that external function.
Free *outBuffer and set *outBuffer to 0, if you want to flush cache.
*/
SRes SzAr_Extract(
const CSzArEx *db,
ILookInStream *inStream,
UInt32 fileIndex, /* index of file */
UInt32 *blockIndex, /* index of solid block */
Byte **outBuffer, /* pointer to pointer to output buffer (allocated with allocMain) */
size_t *outBufferSize, /* buffer size for output buffer */
size_t *offset, /* offset of stream for required file in *outBuffer */
size_t *outSizeProcessed, /* size of file in *outBuffer */
ISzAlloc *allocMain,
ISzAlloc *allocTemp);
#endif

View File

@@ -1,6 +0,0 @@
/* 7zHeader.c -- 7z Headers
2008-10-04 : Igor Pavlov : Public domain */
#include "7zHeader.h"
Byte k7zSignature[k7zSignatureSize] = {'7', 'z', 0xBC, 0xAF, 0x27, 0x1C};

View File

@@ -1,57 +0,0 @@
/* 7zHeader.h -- 7z Headers
2008-10-04 : Igor Pavlov : Public domain */
#ifndef __7Z_HEADER_H
#define __7Z_HEADER_H
#include "../../Types.h"
#define k7zSignatureSize 6
extern Byte k7zSignature[k7zSignatureSize];
#define k7zMajorVersion 0
#define k7zStartHeaderSize 0x20
enum EIdEnum
{
k7zIdEnd,
k7zIdHeader,
k7zIdArchiveProperties,
k7zIdAdditionalStreamsInfo,
k7zIdMainStreamsInfo,
k7zIdFilesInfo,
k7zIdPackInfo,
k7zIdUnpackInfo,
k7zIdSubStreamsInfo,
k7zIdSize,
k7zIdCRC,
k7zIdFolder,
k7zIdCodersUnpackSize,
k7zIdNumUnpackStream,
k7zIdEmptyStream,
k7zIdEmptyFile,
k7zIdAnti,
k7zIdName,
k7zIdCTime,
k7zIdATime,
k7zIdMTime,
k7zIdWinAttributes,
k7zIdComment,
k7zIdEncodedHeader,
k7zIdStartPos,
k7zIdDummy
};
#endif

View File

File diff suppressed because it is too large Load Diff

View File

@@ -1,41 +0,0 @@
/* 7zIn.h -- 7z Input functions
2008-11-23 : Igor Pavlov : Public domain */
#ifndef __7Z_IN_H
#define __7Z_IN_H
#include "7zHeader.h"
#include "7zItem.h"
typedef struct
{
CSzAr db;
UInt64 startPosAfterHeader;
UInt64 dataPos;
UInt32 *FolderStartPackStreamIndex;
UInt64 *PackStreamStartPositions;
UInt32 *FolderStartFileIndex;
UInt32 *FileIndexToFolderIndexMap;
} CSzArEx;
void SzArEx_Init(CSzArEx *p);
void SzArEx_Free(CSzArEx *p, ISzAlloc *alloc);
UInt64 SzArEx_GetFolderStreamPos(const CSzArEx *p, UInt32 folderIndex, UInt32 indexInFolder);
int SzArEx_GetFolderFullPackSize(const CSzArEx *p, UInt32 folderIndex, UInt64 *resSize);
/*
Errors:
SZ_ERROR_NO_ARCHIVE
SZ_ERROR_ARCHIVE
SZ_ERROR_UNSUPPORTED
SZ_ERROR_MEM
SZ_ERROR_CRC
SZ_ERROR_INPUT_EOF
SZ_ERROR_FAIL
*/
SRes SzArEx_Open(CSzArEx *p, ILookInStream *inStream, ISzAlloc *allocMain, ISzAlloc *allocTemp);
#endif

View File

@@ -1,127 +0,0 @@
/* 7zItem.c -- 7z Items
2008-10-04 : Igor Pavlov : Public domain */
#include "7zItem.h"
void SzCoderInfo_Init(CSzCoderInfo *p)
{
Buf_Init(&p->Props);
}
void SzCoderInfo_Free(CSzCoderInfo *p, ISzAlloc *alloc)
{
Buf_Free(&p->Props, alloc);
SzCoderInfo_Init(p);
}
void SzFolder_Init(CSzFolder *p)
{
p->Coders = 0;
p->BindPairs = 0;
p->PackStreams = 0;
p->UnpackSizes = 0;
p->NumCoders = 0;
p->NumBindPairs = 0;
p->NumPackStreams = 0;
p->UnpackCRCDefined = 0;
p->UnpackCRC = 0;
p->NumUnpackStreams = 0;
}
void SzFolder_Free(CSzFolder *p, ISzAlloc *alloc)
{
UInt32 i;
if (p->Coders)
for (i = 0; i < p->NumCoders; i++)
SzCoderInfo_Free(&p->Coders[i], alloc);
IAlloc_Free(alloc, p->Coders);
IAlloc_Free(alloc, p->BindPairs);
IAlloc_Free(alloc, p->PackStreams);
IAlloc_Free(alloc, p->UnpackSizes);
SzFolder_Init(p);
}
UInt32 SzFolder_GetNumOutStreams(CSzFolder *p)
{
UInt32 result = 0;
UInt32 i;
for (i = 0; i < p->NumCoders; i++)
result += p->Coders[i].NumOutStreams;
return result;
}
int SzFolder_FindBindPairForInStream(CSzFolder *p, UInt32 inStreamIndex)
{
UInt32 i;
for (i = 0; i < p->NumBindPairs; i++)
if (p->BindPairs[i].InIndex == inStreamIndex)
return i;
return -1;
}
int SzFolder_FindBindPairForOutStream(CSzFolder *p, UInt32 outStreamIndex)
{
UInt32 i;
for (i = 0; i < p->NumBindPairs; i++)
if (p->BindPairs[i].OutIndex == outStreamIndex)
return i;
return -1;
}
UInt64 SzFolder_GetUnpackSize(CSzFolder *p)
{
int i = (int)SzFolder_GetNumOutStreams(p);
if (i == 0)
return 0;
for (i--; i >= 0; i--)
if (SzFolder_FindBindPairForOutStream(p, i) < 0)
return p->UnpackSizes[i];
/* throw 1; */
return 0;
}
void SzFile_Init(CSzFileItem *p)
{
p->HasStream = 1;
p->IsDir = 0;
p->IsAnti = 0;
p->FileCRCDefined = 0;
p->MTimeDefined = 0;
p->Name = 0;
}
static void SzFile_Free(CSzFileItem *p, ISzAlloc *alloc)
{
IAlloc_Free(alloc, p->Name);
SzFile_Init(p);
}
void SzAr_Init(CSzAr *p)
{
p->PackSizes = 0;
p->PackCRCsDefined = 0;
p->PackCRCs = 0;
p->Folders = 0;
p->Files = 0;
p->NumPackStreams = 0;
p->NumFolders = 0;
p->NumFiles = 0;
}
void SzAr_Free(CSzAr *p, ISzAlloc *alloc)
{
UInt32 i;
if (p->Folders)
for (i = 0; i < p->NumFolders; i++)
SzFolder_Free(&p->Folders[i], alloc);
if (p->Files)
for (i = 0; i < p->NumFiles; i++)
SzFile_Free(&p->Files[i], alloc);
IAlloc_Free(alloc, p->PackSizes);
IAlloc_Free(alloc, p->PackCRCsDefined);
IAlloc_Free(alloc, p->PackCRCs);
IAlloc_Free(alloc, p->Folders);
IAlloc_Free(alloc, p->Files);
SzAr_Init(p);
}

View File

@@ -1,84 +0,0 @@
/* 7zItem.h -- 7z Items
2008-10-04 : Igor Pavlov : Public domain */
#ifndef __7Z_ITEM_H
#define __7Z_ITEM_H
#include "../../7zBuf.h"
typedef struct
{
UInt32 NumInStreams;
UInt32 NumOutStreams;
UInt64 MethodID;
CBuf Props;
} CSzCoderInfo;
void SzCoderInfo_Init(CSzCoderInfo *p);
void SzCoderInfo_Free(CSzCoderInfo *p, ISzAlloc *alloc);
typedef struct
{
UInt32 InIndex;
UInt32 OutIndex;
} CBindPair;
typedef struct
{
CSzCoderInfo *Coders;
CBindPair *BindPairs;
UInt32 *PackStreams;
UInt64 *UnpackSizes;
UInt32 NumCoders;
UInt32 NumBindPairs;
UInt32 NumPackStreams;
int UnpackCRCDefined;
UInt32 UnpackCRC;
UInt32 NumUnpackStreams;
} CSzFolder;
void SzFolder_Init(CSzFolder *p);
UInt64 SzFolder_GetUnpackSize(CSzFolder *p);
int SzFolder_FindBindPairForInStream(CSzFolder *p, UInt32 inStreamIndex);
UInt32 SzFolder_GetNumOutStreams(CSzFolder *p);
UInt64 SzFolder_GetUnpackSize(CSzFolder *p);
typedef struct
{
UInt32 Low;
UInt32 High;
} CNtfsFileTime;
typedef struct
{
CNtfsFileTime MTime;
UInt64 Size;
char *Name;
UInt32 FileCRC;
Byte HasStream;
Byte IsDir;
Byte IsAnti;
Byte FileCRCDefined;
Byte MTimeDefined;
} CSzFileItem;
void SzFile_Init(CSzFileItem *p);
typedef struct
{
UInt64 *PackSizes;
Byte *PackCRCsDefined;
UInt32 *PackCRCs;
CSzFolder *Folders;
CSzFileItem *Files;
UInt32 NumPackStreams;
UInt32 NumFolders;
UInt32 NumFiles;
} CSzAr;
void SzAr_Init(CSzAr *p);
void SzAr_Free(CSzAr *p, ISzAlloc *alloc);
#endif

View File

@@ -1,262 +0,0 @@
/* 7zMain.c - Test application for 7z Decoder
2008-11-23 : Igor Pavlov : Public domain */
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include "../../7zCrc.h"
#include "../../7zFile.h"
#include "../../7zVersion.h"
#include "7zAlloc.h"
#include "7zExtract.h"
#include "7zIn.h"
static void ConvertNumberToString(UInt64 value, char *s)
{
char temp[32];
int pos = 0;
do
{
temp[pos++] = (char)('0' + (int)(value % 10));
value /= 10;
}
while (value != 0);
do
*s++ = temp[--pos];
while (pos > 0);
*s = '\0';
}
#define PERIOD_4 (4 * 365 + 1)
#define PERIOD_100 (PERIOD_4 * 25 - 1)
#define PERIOD_400 (PERIOD_100 * 4 + 1)
static void ConvertFileTimeToString(CNtfsFileTime *ft, char *s)
{
unsigned year, mon, day, hour, min, sec;
UInt64 v64 = ft->Low | ((UInt64)ft->High << 32);
Byte ms[] = { 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31 };
unsigned temp;
UInt32 v;
v64 /= 10000000;
sec = (unsigned)(v64 % 60);
v64 /= 60;
min = (unsigned)(v64 % 60);
v64 /= 60;
hour = (unsigned)(v64 % 24);
v64 /= 24;
v = (UInt32)v64;
year = (unsigned)(1601 + v / PERIOD_400 * 400);
v %= PERIOD_400;
temp = (unsigned)(v / PERIOD_100);
if (temp == 4)
temp = 3;
year += temp * 100;
v -= temp * PERIOD_100;
temp = v / PERIOD_4;
if (temp == 25)
temp = 24;
year += temp * 4;
v -= temp * PERIOD_4;
temp = v / 365;
if (temp == 4)
temp = 3;
year += temp;
v -= temp * 365;
if (year % 4 == 0 && (year % 100 != 0 || year % 400 == 0))
ms[1] = 29;
for (mon = 1; mon <= 12; mon++)
{
unsigned s = ms[mon - 1];
if (v < s)
break;
v -= s;
}
day = (unsigned)v + 1;
sprintf(s, "%04d-%02d-%02d %02d:%02d:%02d", year, mon, day, hour, min, sec);
}
void PrintError(char *sz)
{
printf("\nERROR: %s\n", sz);
}
int MY_CDECL main(int numargs, char *args[])
{
CFileInStream archiveStream;
CLookToRead lookStream;
CSzArEx db;
SRes res;
ISzAlloc allocImp;
ISzAlloc allocTempImp;
printf("\n7z ANSI-C Decoder " MY_VERSION_COPYRIGHT_DATE "\n");
if (numargs == 1)
{
printf(
"\nUsage: 7zDec <command> <archive_name>\n\n"
"<Commands>\n"
" e: Extract files from archive\n"
" l: List contents of archive\n"
" t: Test integrity of archive\n");
return 0;
}
if (numargs < 3)
{
PrintError("incorrect command");
return 1;
}
if (InFile_Open(&archiveStream.file, args[2]))
{
PrintError("can not open input file");
return 1;
}
FileInStream_CreateVTable(&archiveStream);
LookToRead_CreateVTable(&lookStream, False);
lookStream.realStream = &archiveStream.s;
LookToRead_Init(&lookStream);
allocImp.Alloc = SzAlloc;
allocImp.Free = SzFree;
allocTempImp.Alloc = SzAllocTemp;
allocTempImp.Free = SzFreeTemp;
CrcGenerateTable();
SzArEx_Init(&db);
res = SzArEx_Open(&db, &lookStream.s, &allocImp, &allocTempImp);
if (res == SZ_OK)
{
char *command = args[1];
int listCommand = 0, testCommand = 0, extractCommand = 0;
if (strcmp(command, "l") == 0) listCommand = 1;
else if (strcmp(command, "t") == 0) testCommand = 1;
else if (strcmp(command, "e") == 0) extractCommand = 1;
if (listCommand)
{
UInt32 i;
for (i = 0; i < db.db.NumFiles; i++)
{
CSzFileItem *f = db.db.Files + i;
char s[32], t[32];
ConvertNumberToString(f->Size, s);
if (f->MTimeDefined)
ConvertFileTimeToString(&f->MTime, t);
else
strcpy(t, " ");
printf("%s %10s %s\n", t, s, f->Name);
}
}
else if (testCommand || extractCommand)
{
UInt32 i;
/*
if you need cache, use these 3 variables.
if you use external function, you can make these variable as static.
*/
UInt32 blockIndex = 0xFFFFFFFF; /* it can have any value before first call (if outBuffer = 0) */
Byte *outBuffer = 0; /* it must be 0 before first call for each new archive. */
size_t outBufferSize = 0; /* it can have any value before first call (if outBuffer = 0) */
printf("\n");
for (i = 0; i < db.db.NumFiles; i++)
{
size_t offset;
size_t outSizeProcessed;
CSzFileItem *f = db.db.Files + i;
if (f->IsDir)
printf("Directory ");
else
printf(testCommand ?
"Testing ":
"Extracting");
printf(" %s", f->Name);
if (f->IsDir)
{
printf("\n");
continue;
}
res = SzAr_Extract(&db, &lookStream.s, i,
&blockIndex, &outBuffer, &outBufferSize,
&offset, &outSizeProcessed,
&allocImp, &allocTempImp);
if (res != SZ_OK)
break;
if (!testCommand)
{
CSzFile outFile;
size_t processedSize;
char *fileName = f->Name;
size_t nameLen = strlen(f->Name);
for (; nameLen > 0; nameLen--)
if (f->Name[nameLen - 1] == '/')
{
fileName = f->Name + nameLen;
break;
}
if (OutFile_Open(&outFile, fileName))
{
PrintError("can not open output file");
res = SZ_ERROR_FAIL;
break;
}
processedSize = outSizeProcessed;
if (File_Write(&outFile, outBuffer + offset, &processedSize) != 0 ||
processedSize != outSizeProcessed)
{
PrintError("can not write output file");
res = SZ_ERROR_FAIL;
break;
}
if (File_Close(&outFile))
{
PrintError("can not close output file");
res = SZ_ERROR_FAIL;
break;
}
}
printf("\n");
}
IAlloc_Free(&allocImp, outBuffer);
}
else
{
PrintError("incorrect command");
res = SZ_ERROR_FAIL;
}
}
SzArEx_Free(&db, &allocImp);
File_Close(&archiveStream.file);
if (res == SZ_OK)
{
printf("\nEverything is Ok\n");
return 0;
}
if (res == SZ_ERROR_UNSUPPORTED)
PrintError("decoder doesn't support this archive");
else if (res == SZ_ERROR_MEM)
PrintError("can not allocate memory");
else if (res == SZ_ERROR_CRC)
PrintError("CRC error");
else
printf("\nERROR #%d\n", res);
return 1;
}

View File

@@ -1,61 +0,0 @@
PROG = 7zDec
CXX = g++
LIB =
RM = rm -f
CFLAGS = -c -O2 -Wall
OBJS = 7zAlloc.o 7zBuf.o 7zBuf2.o 7zCrc.o 7zDecode.o 7zExtract.o 7zHeader.o 7zIn.o 7zItem.o 7zMain.o LzmaDec.o Bra86.o Bcj2.o 7zFile.o 7zStream.o
all: $(PROG)
$(PROG): $(OBJS)
$(CXX) -o $(PROG) $(LDFLAGS) $(OBJS) $(LIB)
7zAlloc.o: 7zAlloc.c
$(CXX) $(CFLAGS) 7zAlloc.c
7zBuf.o: ../../7zBuf.c
$(CXX) $(CFLAGS) ../../7zBuf.c
7zBuf2.o: ../../7zBuf2.c
$(CXX) $(CFLAGS) ../../7zBuf2.c
7zCrc.o: ../../7zCrc.c
$(CXX) $(CFLAGS) ../../7zCrc.c
7zDecode.o: 7zDecode.c
$(CXX) $(CFLAGS) 7zDecode.c
7zExtract.o: 7zExtract.c
$(CXX) $(CFLAGS) 7zExtract.c
7zHeader.o: 7zHeader.c
$(CXX) $(CFLAGS) 7zHeader.c
7zIn.o: 7zIn.c
$(CXX) $(CFLAGS) 7zIn.c
7zItem.o: 7zItem.c
$(CXX) $(CFLAGS) 7zItem.c
7zMain.o: 7zMain.c
$(CXX) $(CFLAGS) 7zMain.c
LzmaDec.o: ../../LzmaDec.c
$(CXX) $(CFLAGS) ../../LzmaDec.c
Bra86.o: ../../Bra86.c
$(CXX) $(CFLAGS) ../../Bra86.c
Bcj2.o: ../../Bcj2.c
$(CXX) $(CFLAGS) ../../Bcj2.c
7zFile.o: ../../7zFile.c
$(CXX) $(CFLAGS) ../../7zFile.c
7zStream.o: ../../7zStream.c
$(CXX) $(CFLAGS) ../../7zStream.c
clean:
-$(RM) $(PROG) $(OBJS)

384
C/Bcj2.c
View File

@@ -1,132 +1,290 @@
/* Bcj2.c -- Converter for x86 code (BCJ2)
2008-10-04 : Igor Pavlov : Public domain */
/* Bcj2.c -- BCJ2 Decoder (Converter for x86 code)
2023-03-01 : Igor Pavlov : Public domain */
#include "Precomp.h"
#include "Bcj2.h"
#include "CpuArch.h"
#ifdef _LZMA_PROB32
#define CProb UInt32
#else
#define CProb UInt16
#endif
#define IsJcc(b0, b1) ((b0) == 0x0F && ((b1) & 0xF0) == 0x80)
#define IsJ(b0, b1) ((b1 & 0xFE) == 0xE8 || IsJcc(b0, b1))
#define kNumTopBits 24
#define kTopValue ((UInt32)1 << kNumTopBits)
#define kTopValue ((UInt32)1 << 24)
#define kNumBitModelTotalBits 11
#define kBitModelTotal (1 << kNumBitModelTotalBits)
#define kNumMoveBits 5
#define RC_READ_BYTE (*buffer++)
#define RC_TEST { if (buffer == bufferLim) return SZ_ERROR_DATA; }
#define RC_INIT2 code = 0; range = 0xFFFFFFFF; \
{ int i; for (i = 0; i < 5; i++) { RC_TEST; code = (code << 8) | RC_READ_BYTE; }}
// UInt32 bcj2_stats[256 + 2][2];
#define NORMALIZE if (range < kTopValue) { RC_TEST; range <<= 8; code = (code << 8) | RC_READ_BYTE; }
#define IF_BIT_0(p) ttt = *(p); bound = (range >> kNumBitModelTotalBits) * ttt; if (code < bound)
#define UPDATE_0(p) range = bound; *(p) = (CProb)(ttt + ((kBitModelTotal - ttt) >> kNumMoveBits)); NORMALIZE;
#define UPDATE_1(p) range -= bound; code -= bound; *(p) = (CProb)(ttt - (ttt >> kNumMoveBits)); NORMALIZE;
int Bcj2_Decode(
const Byte *buf0, SizeT size0,
const Byte *buf1, SizeT size1,
const Byte *buf2, SizeT size2,
const Byte *buf3, SizeT size3,
Byte *outBuf, SizeT outSize)
void Bcj2Dec_Init(CBcj2Dec *p)
{
CProb p[256 + 2];
SizeT inPos = 0, outPos = 0;
unsigned i;
p->state = BCJ2_STREAM_RC; // BCJ2_DEC_STATE_OK;
p->ip = 0;
p->temp = 0;
p->range = 0;
p->code = 0;
for (i = 0; i < sizeof(p->probs) / sizeof(p->probs[0]); i++)
p->probs[i] = kBitModelTotal >> 1;
}
const Byte *buffer, *bufferLim;
UInt32 range, code;
Byte prevByte = 0;
unsigned int i;
for (i = 0; i < sizeof(p) / sizeof(p[0]); i++)
p[i] = kBitModelTotal >> 1;
buffer = buf3;
bufferLim = buffer + size3;
RC_INIT2
if (outSize == 0)
return SZ_OK;
for (;;)
SRes Bcj2Dec_Decode(CBcj2Dec *p)
{
UInt32 v = p->temp;
// const Byte *src;
if (p->range <= 5)
{
Byte b;
CProb *prob;
UInt32 bound;
UInt32 ttt;
SizeT limit = size0 - inPos;
if (outSize - outPos < limit)
limit = outSize - outPos;
while (limit != 0)
UInt32 code = p->code;
p->state = BCJ2_DEC_STATE_ERROR; /* for case if we return SZ_ERROR_DATA; */
for (; p->range != 5; p->range++)
{
Byte b = buf0[inPos];
outBuf[outPos++] = b;
if (IsJ(prevByte, b))
break;
inPos++;
prevByte = b;
limit--;
}
if (limit == 0 || outPos == outSize)
break;
b = buf0[inPos++];
if (b == 0xE8)
prob = p + prevByte;
else if (b == 0xE9)
prob = p + 256;
else
prob = p + 257;
IF_BIT_0(prob)
{
UPDATE_0(prob)
prevByte = b;
}
else
{
UInt32 dest;
const Byte *v;
UPDATE_1(prob)
if (b == 0xE8)
if (p->range == 1 && code != 0)
return SZ_ERROR_DATA;
if (p->bufs[BCJ2_STREAM_RC] == p->lims[BCJ2_STREAM_RC])
{
v = buf1;
if (size1 < 4)
return SZ_ERROR_DATA;
buf1 += 4;
size1 -= 4;
p->state = BCJ2_STREAM_RC;
return SZ_OK;
}
else
code = (code << 8) | *(p->bufs[BCJ2_STREAM_RC])++;
p->code = code;
}
if (code == 0xffffffff)
return SZ_ERROR_DATA;
p->range = 0xffffffff;
}
// else
{
unsigned state = p->state;
// we check BCJ2_IS_32BIT_STREAM() here instead of check in the main loop
if (BCJ2_IS_32BIT_STREAM(state))
{
const Byte *cur = p->bufs[state];
if (cur == p->lims[state])
return SZ_OK;
p->bufs[state] = cur + 4;
{
v = buf2;
if (size2 < 4)
return SZ_ERROR_DATA;
buf2 += 4;
size2 -= 4;
const UInt32 ip = p->ip + 4;
v = GetBe32a(cur) - ip;
p->ip = ip;
}
state = BCJ2_DEC_STATE_ORIG_0;
}
if ((unsigned)(state - BCJ2_DEC_STATE_ORIG_0) < 4)
{
Byte *dest = p->dest;
for (;;)
{
if (dest == p->destLim)
{
p->state = state;
p->temp = v;
return SZ_OK;
}
*dest++ = (Byte)v;
p->dest = dest;
if (++state == BCJ2_DEC_STATE_ORIG_3 + 1)
break;
v >>= 8;
}
dest = (((UInt32)v[0] << 24) | ((UInt32)v[1] << 16) |
((UInt32)v[2] << 8) | ((UInt32)v[3])) - ((UInt32)outPos + 4);
outBuf[outPos++] = (Byte)dest;
if (outPos == outSize)
break;
outBuf[outPos++] = (Byte)(dest >> 8);
if (outPos == outSize)
break;
outBuf[outPos++] = (Byte)(dest >> 16);
if (outPos == outSize)
break;
outBuf[outPos++] = prevByte = (Byte)(dest >> 24);
}
}
return (outPos == outSize) ? SZ_OK : SZ_ERROR_DATA;
// src = p->bufs[BCJ2_STREAM_MAIN];
for (;;)
{
/*
if (BCJ2_IS_32BIT_STREAM(p->state))
p->state = BCJ2_DEC_STATE_OK;
else
*/
{
if (p->range < kTopValue)
{
if (p->bufs[BCJ2_STREAM_RC] == p->lims[BCJ2_STREAM_RC])
{
p->state = BCJ2_STREAM_RC;
p->temp = v;
return SZ_OK;
}
p->range <<= 8;
p->code = (p->code << 8) | *(p->bufs[BCJ2_STREAM_RC])++;
}
{
const Byte *src = p->bufs[BCJ2_STREAM_MAIN];
const Byte *srcLim;
Byte *dest = p->dest;
{
const SizeT rem = (SizeT)(p->lims[BCJ2_STREAM_MAIN] - src);
SizeT num = (SizeT)(p->destLim - dest);
if (num >= rem)
num = rem;
#define NUM_ITERS 4
#if (NUM_ITERS & (NUM_ITERS - 1)) == 0
num &= ~((SizeT)NUM_ITERS - 1); // if (NUM_ITERS == (1 << x))
#else
num -= num % NUM_ITERS; // if (NUM_ITERS != (1 << x))
#endif
srcLim = src + num;
}
#define NUM_SHIFT_BITS 24
#define ONE_ITER(indx) { \
const unsigned b = src[indx]; \
*dest++ = (Byte)b; \
v = (v << NUM_SHIFT_BITS) | b; \
if (((b + (0x100 - 0xe8)) & 0xfe) == 0) break; \
if (((v - (((UInt32)0x0f << (NUM_SHIFT_BITS)) + 0x80)) & \
((((UInt32)1 << (4 + NUM_SHIFT_BITS)) - 0x1) << 4)) == 0) break; \
/* ++dest */; /* v = b; */ }
if (src != srcLim)
for (;;)
{
/* The dependency chain of 2-cycle for (v) calculation is not big problem here.
But we can remove dependency chain with v = b in the end of loop. */
ONE_ITER(0)
#if (NUM_ITERS > 1)
ONE_ITER(1)
#if (NUM_ITERS > 2)
ONE_ITER(2)
#if (NUM_ITERS > 3)
ONE_ITER(3)
#if (NUM_ITERS > 4)
ONE_ITER(4)
#if (NUM_ITERS > 5)
ONE_ITER(5)
#if (NUM_ITERS > 6)
ONE_ITER(6)
#if (NUM_ITERS > 7)
ONE_ITER(7)
#endif
#endif
#endif
#endif
#endif
#endif
#endif
src += NUM_ITERS;
if (src == srcLim)
break;
}
if (src == srcLim)
#if (NUM_ITERS > 1)
for (;;)
#endif
{
#if (NUM_ITERS > 1)
if (src == p->lims[BCJ2_STREAM_MAIN] || dest == p->destLim)
#endif
{
const SizeT num = (SizeT)(src - p->bufs[BCJ2_STREAM_MAIN]);
p->bufs[BCJ2_STREAM_MAIN] = src;
p->dest = dest;
p->ip += (UInt32)num;
/* state BCJ2_STREAM_MAIN has more priority than BCJ2_STATE_ORIG */
p->state =
src == p->lims[BCJ2_STREAM_MAIN] ?
(unsigned)BCJ2_STREAM_MAIN :
(unsigned)BCJ2_DEC_STATE_ORIG;
p->temp = v;
return SZ_OK;
}
#if (NUM_ITERS > 1)
ONE_ITER(0)
src++;
#endif
}
{
const SizeT num = (SizeT)(dest - p->dest);
p->dest = dest; // p->dest += num;
p->bufs[BCJ2_STREAM_MAIN] += num; // = src;
p->ip += (UInt32)num;
}
{
UInt32 bound, ttt;
CBcj2Prob *prob; // unsigned index;
/*
prob = p->probs + (unsigned)((Byte)v == 0xe8 ?
2 + (Byte)(v >> 8) :
((v >> 5) & 1)); // ((Byte)v < 0xe8 ? 0 : 1));
*/
{
const unsigned c = ((v + 0x17) >> 6) & 1;
prob = p->probs + (unsigned)
(((0 - c) & (Byte)(v >> NUM_SHIFT_BITS)) + c + ((v >> 5) & 1));
// (Byte)
// 8x->0 : e9->1 : xxe8->xx+2
// 8x->0x100 : e9->0x101 : xxe8->xx
// (((0x100 - (e & ~v)) & (0x100 | (v >> 8))) + (e & v));
// (((0x101 + (~e | v)) & (0x100 | (v >> 8))) + (e & v));
}
ttt = *prob;
bound = (p->range >> kNumBitModelTotalBits) * ttt;
if (p->code < bound)
{
// bcj2_stats[prob - p->probs][0]++;
p->range = bound;
*prob = (CBcj2Prob)(ttt + ((kBitModelTotal - ttt) >> kNumMoveBits));
continue;
}
{
// bcj2_stats[prob - p->probs][1]++;
p->range -= bound;
p->code -= bound;
*prob = (CBcj2Prob)(ttt - (ttt >> kNumMoveBits));
}
}
}
}
{
/* (v == 0xe8 ? 0 : 1) uses setcc instruction with additional zero register usage in x64 MSVC. */
// const unsigned cj = ((Byte)v == 0xe8) ? BCJ2_STREAM_CALL : BCJ2_STREAM_JUMP;
const unsigned cj = (((v + 0x57) >> 6) & 1) + BCJ2_STREAM_CALL;
const Byte *cur = p->bufs[cj];
Byte *dest;
SizeT rem;
if (cur == p->lims[cj])
{
p->state = cj;
break;
}
v = GetBe32a(cur);
p->bufs[cj] = cur + 4;
{
const UInt32 ip = p->ip + 4;
v -= ip;
p->ip = ip;
}
dest = p->dest;
rem = (SizeT)(p->destLim - dest);
if (rem < 4)
{
if ((unsigned)rem > 0) { dest[0] = (Byte)v; v >>= 8;
if ((unsigned)rem > 1) { dest[1] = (Byte)v; v >>= 8;
if ((unsigned)rem > 2) { dest[2] = (Byte)v; v >>= 8; }}}
p->temp = v;
p->dest = dest + rem;
p->state = BCJ2_DEC_STATE_ORIG_0 + (unsigned)rem;
break;
}
SetUi32(dest, v)
v >>= 24;
p->dest = dest + 4;
}
}
if (p->range < kTopValue && p->bufs[BCJ2_STREAM_RC] != p->lims[BCJ2_STREAM_RC])
{
p->range <<= 8;
p->code = (p->code << 8) | *(p->bufs[BCJ2_STREAM_RC])++;
}
return SZ_OK;
}
#undef NUM_ITERS
#undef ONE_ITER
#undef NUM_SHIFT_BITS
#undef kTopValue
#undef kNumBitModelTotalBits
#undef kBitModelTotal
#undef kNumMoveBits

346
C/Bcj2.h
View File

@@ -1,30 +1,332 @@
/* Bcj2.h -- Converter for x86 code (BCJ2)
2008-10-04 : Igor Pavlov : Public domain */
/* Bcj2.h -- BCJ2 converter for x86 code (Branch CALL/JUMP variant2)
2023-03-02 : Igor Pavlov : Public domain */
#ifndef __BCJ2_H
#define __BCJ2_H
#ifndef ZIP7_INC_BCJ2_H
#define ZIP7_INC_BCJ2_H
#include "Types.h"
#include "7zTypes.h"
EXTERN_C_BEGIN
#define BCJ2_NUM_STREAMS 4
enum
{
BCJ2_STREAM_MAIN,
BCJ2_STREAM_CALL,
BCJ2_STREAM_JUMP,
BCJ2_STREAM_RC
};
enum
{
BCJ2_DEC_STATE_ORIG_0 = BCJ2_NUM_STREAMS,
BCJ2_DEC_STATE_ORIG_1,
BCJ2_DEC_STATE_ORIG_2,
BCJ2_DEC_STATE_ORIG_3,
BCJ2_DEC_STATE_ORIG,
BCJ2_DEC_STATE_ERROR /* after detected data error */
};
enum
{
BCJ2_ENC_STATE_ORIG = BCJ2_NUM_STREAMS,
BCJ2_ENC_STATE_FINISHED /* it's state after fully encoded stream */
};
/* #define BCJ2_IS_32BIT_STREAM(s) ((s) == BCJ2_STREAM_CALL || (s) == BCJ2_STREAM_JUMP) */
#define BCJ2_IS_32BIT_STREAM(s) ((unsigned)((unsigned)(s) - (unsigned)BCJ2_STREAM_CALL) < 2)
/*
Conditions:
outSize <= FullOutputSize,
where FullOutputSize is full size of output stream of x86_2 filter.
If buf0 overlaps outBuf, there are two required conditions:
1) (buf0 >= outBuf)
2) (buf0 + size0 >= outBuf + FullOutputSize).
Returns:
SZ_OK
SZ_ERROR_DATA - Data error
CBcj2Dec / CBcj2Enc
bufs sizes:
BUF_SIZE(n) = lims[n] - bufs[n]
bufs sizes for BCJ2_STREAM_CALL and BCJ2_STREAM_JUMP must be multiply of 4:
(BUF_SIZE(BCJ2_STREAM_CALL) & 3) == 0
(BUF_SIZE(BCJ2_STREAM_JUMP) & 3) == 0
*/
int Bcj2_Decode(
const Byte *buf0, SizeT size0,
const Byte *buf1, SizeT size1,
const Byte *buf2, SizeT size2,
const Byte *buf3, SizeT size3,
Byte *outBuf, SizeT outSize);
// typedef UInt32 CBcj2Prob;
typedef UInt16 CBcj2Prob;
/*
BCJ2 encoder / decoder internal requirements:
- If last bytes of stream contain marker (e8/e8/0f8x), then
there is also encoded symbol (0 : no conversion) in RC stream.
- One case of overlapped instructions is supported,
if last byte of converted instruction is (0f) and next byte is (8x):
marker [xx xx xx 0f] 8x
then the pair (0f 8x) is treated as marker.
*/
/* ---------- BCJ2 Decoder ---------- */
/*
CBcj2Dec:
(dest) is allowed to overlap with bufs[BCJ2_STREAM_MAIN], with the following conditions:
bufs[BCJ2_STREAM_MAIN] >= dest &&
bufs[BCJ2_STREAM_MAIN] - dest >=
BUF_SIZE(BCJ2_STREAM_CALL) +
BUF_SIZE(BCJ2_STREAM_JUMP)
reserve = bufs[BCJ2_STREAM_MAIN] - dest -
( BUF_SIZE(BCJ2_STREAM_CALL) +
BUF_SIZE(BCJ2_STREAM_JUMP) )
and additional conditions:
if (it's first call of Bcj2Dec_Decode() after Bcj2Dec_Init())
{
(reserve != 1) : if (ver < v23.00)
}
else // if there are more than one calls of Bcj2Dec_Decode() after Bcj2Dec_Init())
{
(reserve >= 6) : if (ver < v23.00)
(reserve >= 4) : if (ver >= v23.00)
We need that (reserve) because after first call of Bcj2Dec_Decode(),
CBcj2Dec::temp can contain up to 4 bytes for writing to (dest).
}
(reserve == 0) is allowed, if we decode full stream via single call of Bcj2Dec_Decode().
(reserve == 0) also is allowed in case of multi-call, if we use fixed buffers,
and (reserve) is calculated from full (final) sizes of all streams before first call.
*/
typedef struct
{
const Byte *bufs[BCJ2_NUM_STREAMS];
const Byte *lims[BCJ2_NUM_STREAMS];
Byte *dest;
const Byte *destLim;
unsigned state; /* BCJ2_STREAM_MAIN has more priority than BCJ2_STATE_ORIG */
UInt32 ip; /* property of starting base for decoding */
UInt32 temp; /* Byte temp[4]; */
UInt32 range;
UInt32 code;
CBcj2Prob probs[2 + 256];
} CBcj2Dec;
/* Note:
Bcj2Dec_Init() sets (CBcj2Dec::ip = 0)
if (ip != 0) property is required, the caller must set CBcj2Dec::ip after Bcj2Dec_Init()
*/
void Bcj2Dec_Init(CBcj2Dec *p);
/* Bcj2Dec_Decode():
returns:
SZ_OK
SZ_ERROR_DATA : if data in 5 starting bytes of BCJ2_STREAM_RC stream are not correct
*/
SRes Bcj2Dec_Decode(CBcj2Dec *p);
/* To check that decoding was finished you can compare
sizes of processed streams with sizes known from another sources.
You must do at least one mandatory check from the two following options:
- the check for size of processed output (ORIG) stream.
- the check for size of processed input (MAIN) stream.
additional optional checks:
- the checks for processed sizes of all input streams (MAIN, CALL, JUMP, RC)
- the checks Bcj2Dec_IsMaybeFinished*()
also before actual decoding you can check that the
following condition is met for stream sizes:
( size(ORIG) == size(MAIN) + size(CALL) + size(JUMP) )
*/
/* (state == BCJ2_STREAM_MAIN) means that decoder is ready for
additional input data in BCJ2_STREAM_MAIN stream.
Note that (state == BCJ2_STREAM_MAIN) is allowed for non-finished decoding.
*/
#define Bcj2Dec_IsMaybeFinished_state_MAIN(_p_) ((_p_)->state == BCJ2_STREAM_MAIN)
/* if the stream decoding was finished correctly, then range decoder
part of CBcj2Dec also was finished, and then (CBcj2Dec::code == 0).
Note that (CBcj2Dec::code == 0) is allowed for non-finished decoding.
*/
#define Bcj2Dec_IsMaybeFinished_code(_p_) ((_p_)->code == 0)
/* use Bcj2Dec_IsMaybeFinished() only as additional check
after at least one mandatory check from the two following options:
- the check for size of processed output (ORIG) stream.
- the check for size of processed input (MAIN) stream.
*/
#define Bcj2Dec_IsMaybeFinished(_p_) ( \
Bcj2Dec_IsMaybeFinished_state_MAIN(_p_) && \
Bcj2Dec_IsMaybeFinished_code(_p_))
/* ---------- BCJ2 Encoder ---------- */
typedef enum
{
BCJ2_ENC_FINISH_MODE_CONTINUE,
BCJ2_ENC_FINISH_MODE_END_BLOCK,
BCJ2_ENC_FINISH_MODE_END_STREAM
} EBcj2Enc_FinishMode;
/*
BCJ2_ENC_FINISH_MODE_CONTINUE:
process non finished encoding.
It notifies the encoder that additional further calls
can provide more input data (src) than provided by current call.
In that case the CBcj2Enc encoder still can move (src) pointer
up to (srcLim), but CBcj2Enc encoder can store some of the last
processed bytes (up to 4 bytes) from src to internal CBcj2Enc::temp[] buffer.
at return:
(CBcj2Enc::src will point to position that includes
processed data and data copied to (temp[]) buffer)
That data from (temp[]) buffer will be used in further calls.
BCJ2_ENC_FINISH_MODE_END_BLOCK:
finish encoding of current block (ended at srcLim) without RC flushing.
at return: if (CBcj2Enc::state == BCJ2_ENC_STATE_ORIG) &&
CBcj2Enc::src == CBcj2Enc::srcLim)
: it shows that block encoding was finished. And the encoder is
ready for new (src) data or for stream finish operation.
finished block means
{
CBcj2Enc has completed block encoding up to (srcLim).
(1 + 4 bytes) or (2 + 4 bytes) CALL/JUMP cortages will
not cross block boundary at (srcLim).
temporary CBcj2Enc buffer for (ORIG) src data is empty.
3 output uncompressed streams (MAIN, CALL, JUMP) were flushed.
RC stream was not flushed. And RC stream will cross block boundary.
}
Note: some possible implementation of BCJ2 encoder could
write branch marker (e8/e8/0f8x) in one call of Bcj2Enc_Encode(),
and it could calculate symbol for RC in another call of Bcj2Enc_Encode().
BCJ2 encoder uses ip/fileIp/fileSize/relatLimit values to calculate RC symbol.
And these CBcj2Enc variables can have different values in different Bcj2Enc_Encode() calls.
So caller must finish each block with BCJ2_ENC_FINISH_MODE_END_BLOCK
to ensure that RC symbol is calculated and written in proper block.
BCJ2_ENC_FINISH_MODE_END_STREAM
finish encoding of stream (ended at srcLim) fully including RC flushing.
at return: if (CBcj2Enc::state == BCJ2_ENC_STATE_FINISHED)
: it shows that stream encoding was finished fully,
and all output streams were flushed fully.
also Bcj2Enc_IsFinished() can be called.
*/
/*
32-bit relative offset in JUMP/CALL commands is
- (mod 4 GiB) for 32-bit x86 code
- signed Int32 for 64-bit x86-64 code
BCJ2 encoder also does internal relative to absolute address conversions.
And there are 2 possible ways to do it:
before v23: we used 32-bit variables and (mod 4 GiB) conversion
since v23: we use 64-bit variables and (signed Int32 offset) conversion.
The absolute address condition for conversion in v23:
((UInt64)((Int64)ip64 - (Int64)fileIp64 + 5 + (Int32)offset) < (UInt64)fileSize64)
note that if (fileSize64 > 2 GiB). there is difference between
old (mod 4 GiB) way (v22) and new (signed Int32 offset) way (v23).
And new (v23) way is more suitable to encode 64-bit x86-64 code for (fileSize64 > 2 GiB) cases.
*/
/*
// for old (v22) way for conversion:
typedef UInt32 CBcj2Enc_ip_unsigned;
typedef Int32 CBcj2Enc_ip_signed;
#define BCJ2_ENC_FileSize_MAX ((UInt32)1 << 31)
*/
typedef UInt64 CBcj2Enc_ip_unsigned;
typedef Int64 CBcj2Enc_ip_signed;
/* maximum size of file that can be used for conversion condition */
#define BCJ2_ENC_FileSize_MAX ((CBcj2Enc_ip_unsigned)0 - 2)
/* default value of fileSize64_minus1 variable that means
that absolute address limitation will not be used */
#define BCJ2_ENC_FileSizeField_UNLIMITED ((CBcj2Enc_ip_unsigned)0 - 1)
/* calculate value that later can be set to CBcj2Enc::fileSize64_minus1 */
#define BCJ2_ENC_GET_FileSizeField_VAL_FROM_FileSize(fileSize) \
((CBcj2Enc_ip_unsigned)(fileSize) - 1)
/* set CBcj2Enc::fileSize64_minus1 variable from size of file */
#define Bcj2Enc_SET_FileSize(p, fileSize) \
(p)->fileSize64_minus1 = BCJ2_ENC_GET_FileSizeField_VAL_FROM_FileSize(fileSize);
typedef struct
{
Byte *bufs[BCJ2_NUM_STREAMS];
const Byte *lims[BCJ2_NUM_STREAMS];
const Byte *src;
const Byte *srcLim;
unsigned state;
EBcj2Enc_FinishMode finishMode;
Byte context;
Byte flushRem;
Byte isFlushState;
Byte cache;
UInt32 range;
UInt64 low;
UInt64 cacheSize;
// UInt32 context; // for marker version, it can include marker flag.
/* (ip64) and (fileIp64) correspond to virtual source stream position
that doesn't include data in temp[] */
CBcj2Enc_ip_unsigned ip64; /* current (ip) position */
CBcj2Enc_ip_unsigned fileIp64; /* start (ip) position of current file */
CBcj2Enc_ip_unsigned fileSize64_minus1; /* size of current file (for conversion limitation) */
UInt32 relatLimit; /* (relatLimit <= ((UInt32)1 << 31)) : 0 means disable_conversion */
// UInt32 relatExcludeBits;
UInt32 tempTarget;
unsigned tempPos; /* the number of bytes that were copied to temp[] buffer
(tempPos <= 4) outside of Bcj2Enc_Encode() */
// Byte temp[4]; // for marker version
Byte temp[8];
CBcj2Prob probs[2 + 256];
} CBcj2Enc;
void Bcj2Enc_Init(CBcj2Enc *p);
/*
Bcj2Enc_Encode(): at exit:
p->State < BCJ2_NUM_STREAMS : we need more buffer space for output stream
(bufs[p->State] == lims[p->State])
p->State == BCJ2_ENC_STATE_ORIG : we need more data in input src stream
(src == srcLim)
p->State == BCJ2_ENC_STATE_FINISHED : after fully encoded stream
*/
void Bcj2Enc_Encode(CBcj2Enc *p);
/* Bcj2Enc encoder can look ahead for up 4 bytes of source stream.
CBcj2Enc::tempPos : is the number of bytes that were copied from input stream to temp[] buffer.
(CBcj2Enc::src) after Bcj2Enc_Encode() is starting position after
fully processed data and after data copied to temp buffer.
So if the caller needs to get real number of fully processed input
bytes (without look ahead data in temp buffer),
the caller must subtruct (CBcj2Enc::tempPos) value from processed size
value that is calculated based on current (CBcj2Enc::src):
cur_processed_pos = Calc_Big_Processed_Pos(enc.src)) -
Bcj2Enc_Get_AvailInputSize_in_Temp(&enc);
*/
/* get the size of input data that was stored in temp[] buffer: */
#define Bcj2Enc_Get_AvailInputSize_in_Temp(p) ((p)->tempPos)
#define Bcj2Enc_IsFinished(p) ((p)->flushRem == 0)
/* Note : the decoder supports overlapping of marker (0f 80).
But we can eliminate such overlapping cases by setting
the limit for relative offset conversion as
CBcj2Enc::relatLimit <= (0x0f << 24) == (240 MiB)
*/
/* default value for CBcj2Enc::relatLimit */
#define BCJ2_ENC_RELAT_LIMIT_DEFAULT ((UInt32)0x0f << 24)
#define BCJ2_ENC_RELAT_LIMIT_MAX ((UInt32)1 << 31)
// #define BCJ2_RELAT_EXCLUDE_NUM_BITS 5
EXTERN_C_END
#endif

506
C/Bcj2Enc.c Executable file
View File

@@ -0,0 +1,506 @@
/* Bcj2Enc.c -- BCJ2 Encoder converter for x86 code (Branch CALL/JUMP variant2)
2023-04-02 : Igor Pavlov : Public domain */
#include "Precomp.h"
/* #define SHOW_STAT */
#ifdef SHOW_STAT
#include <stdio.h>
#define PRF2(s) printf("%s ip=%8x tempPos=%d src= %8x\n", s, (unsigned)p->ip64, p->tempPos, (unsigned)(p->srcLim - p->src));
#else
#define PRF2(s)
#endif
#include "Bcj2.h"
#include "CpuArch.h"
#define kTopValue ((UInt32)1 << 24)
#define kNumBitModelTotalBits 11
#define kBitModelTotal (1 << kNumBitModelTotalBits)
#define kNumMoveBits 5
void Bcj2Enc_Init(CBcj2Enc *p)
{
unsigned i;
p->state = BCJ2_ENC_STATE_ORIG;
p->finishMode = BCJ2_ENC_FINISH_MODE_CONTINUE;
p->context = 0;
p->flushRem = 5;
p->isFlushState = 0;
p->cache = 0;
p->range = 0xffffffff;
p->low = 0;
p->cacheSize = 1;
p->ip64 = 0;
p->fileIp64 = 0;
p->fileSize64_minus1 = BCJ2_ENC_FileSizeField_UNLIMITED;
p->relatLimit = BCJ2_ENC_RELAT_LIMIT_DEFAULT;
// p->relatExcludeBits = 0;
p->tempPos = 0;
for (i = 0; i < sizeof(p->probs) / sizeof(p->probs[0]); i++)
p->probs[i] = kBitModelTotal >> 1;
}
// Z7_NO_INLINE
Z7_FORCE_INLINE
static BoolInt Bcj2_RangeEnc_ShiftLow(CBcj2Enc *p)
{
const UInt32 low = (UInt32)p->low;
const unsigned high = (unsigned)
#if defined(Z7_MSC_VER_ORIGINAL) \
&& defined(MY_CPU_X86) \
&& defined(MY_CPU_LE) \
&& !defined(MY_CPU_64BIT)
// we try to rid of __aullshr() call in MSVS-x86
(((const UInt32 *)&p->low)[1]); // [1] : for little-endian only
#else
(p->low >> 32);
#endif
if (low < (UInt32)0xff000000 || high != 0)
{
Byte *buf = p->bufs[BCJ2_STREAM_RC];
do
{
if (buf == p->lims[BCJ2_STREAM_RC])
{
p->state = BCJ2_STREAM_RC;
p->bufs[BCJ2_STREAM_RC] = buf;
return True;
}
*buf++ = (Byte)(p->cache + high);
p->cache = 0xff;
}
while (--p->cacheSize);
p->bufs[BCJ2_STREAM_RC] = buf;
p->cache = (Byte)(low >> 24);
}
p->cacheSize++;
p->low = low << 8;
return False;
}
/*
We can use 2 alternative versions of code:
1) non-marker version:
Byte CBcj2Enc::context
Byte temp[8];
Last byte of marker (e8/e9/[0f]8x) can be written to temp[] buffer.
Encoder writes last byte of marker (e8/e9/[0f]8x) to dest, only in conjunction
with writing branch symbol to range coder in same Bcj2Enc_Encode_2() call.
2) marker version:
UInt32 CBcj2Enc::context
Byte CBcj2Enc::temp[4];
MARKER_FLAG in CBcj2Enc::context shows that CBcj2Enc::context contains finded marker.
it's allowed that
one call of Bcj2Enc_Encode_2() writes last byte of marker (e8/e9/[0f]8x) to dest,
and another call of Bcj2Enc_Encode_2() does offset conversion.
So different values of (fileIp) and (fileSize) are possible
in these different Bcj2Enc_Encode_2() calls.
Also marker version requires additional if((v & MARKER_FLAG) == 0) check in main loop.
So we use non-marker version.
*/
/*
Corner cases with overlap in multi-block.
before v23: there was one corner case, where converted instruction
could start in one sub-stream and finish in next sub-stream.
If multi-block (solid) encoding is used,
and BCJ2_ENC_FINISH_MODE_END_BLOCK is used for each sub-stream.
and (0f) is last byte of previous sub-stream
and (8x) is first byte of current sub-stream
then (0f 8x) pair is treated as marker by BCJ2 encoder and decoder.
BCJ2 encoder can converts 32-bit offset for that (0f 8x) cortage,
if that offset meets limit requirements.
If encoder allows 32-bit offset conversion for such overlap case,
then the data in 3 uncompressed BCJ2 streams for some sub-stream
can depend from data of previous sub-stream.
That corner case is not big problem, and it's rare case.
Since v23.00 we do additional check to prevent conversions in such overlap cases.
*/
/*
Bcj2Enc_Encode_2() output variables at exit:
{
if (Bcj2Enc_Encode_2() exits with (p->state == BCJ2_ENC_STATE_ORIG))
{
it means that encoder needs more input data.
if (p->srcLim == p->src) at exit, then
{
(p->finishMode != BCJ2_ENC_FINISH_MODE_END_STREAM)
all input data were read and processed, and we are ready for
new input data.
}
else
{
(p->srcLim != p->src)
(p->finishMode == BCJ2_ENC_FINISH_MODE_CONTINUE)
The encoder have found e8/e9/0f_8x marker,
and p->src points to last byte of that marker,
Bcj2Enc_Encode_2() needs more input data to get totally
5 bytes (last byte of marker and 32-bit branch offset)
as continuous array starting from p->src.
(p->srcLim - p->src < 5) requirement is met after exit.
So non-processed resedue from p->src to p->srcLim is always less than 5 bytes.
}
}
}
*/
Z7_NO_INLINE
static void Bcj2Enc_Encode_2(CBcj2Enc *p)
{
if (!p->isFlushState)
{
const Byte *src;
UInt32 v;
{
const unsigned state = p->state;
if (BCJ2_IS_32BIT_STREAM(state))
{
Byte *cur = p->bufs[state];
if (cur == p->lims[state])
return;
SetBe32a(cur, p->tempTarget)
p->bufs[state] = cur + 4;
}
}
p->state = BCJ2_ENC_STATE_ORIG; // for main reason of exit
src = p->src;
v = p->context;
// #define WRITE_CONTEXT p->context = v; // for marker version
#define WRITE_CONTEXT p->context = (Byte)v;
#define WRITE_CONTEXT_AND_SRC p->src = src; WRITE_CONTEXT
for (;;)
{
// const Byte *src;
// UInt32 v;
CBcj2Enc_ip_unsigned ip;
if (p->range < kTopValue)
{
// to reduce register pressure and code size: we save and restore local variables.
WRITE_CONTEXT_AND_SRC
if (Bcj2_RangeEnc_ShiftLow(p))
return;
p->range <<= 8;
src = p->src;
v = p->context;
}
// src = p->src;
// #define MARKER_FLAG ((UInt32)1 << 17)
// if ((v & MARKER_FLAG) == 0) // for marker version
{
const Byte *srcLim;
Byte *dest = p->bufs[BCJ2_STREAM_MAIN];
{
const SizeT remSrc = (SizeT)(p->srcLim - src);
SizeT rem = (SizeT)(p->lims[BCJ2_STREAM_MAIN] - dest);
if (rem >= remSrc)
rem = remSrc;
srcLim = src + rem;
}
/* p->context contains context of previous byte:
bits [0 : 7] : src[-1], if (src) was changed in this call
bits [8 : 31] : are undefined for non-marker version
*/
// v = p->context;
#define NUM_SHIFT_BITS 24
#define CONV_FLAG ((UInt32)1 << 16)
#define ONE_ITER { \
b = src[0]; \
*dest++ = (Byte)b; \
v = (v << NUM_SHIFT_BITS) | b; \
if (((b + (0x100 - 0xe8)) & 0xfe) == 0) break; \
if (((v - (((UInt32)0x0f << (NUM_SHIFT_BITS)) + 0x80)) & \
((((UInt32)1 << (4 + NUM_SHIFT_BITS)) - 0x1) << 4)) == 0) break; \
src++; if (src == srcLim) { break; } }
if (src != srcLim)
for (;;)
{
/* clang can generate ineffective code with setne instead of two jcc instructions.
we can use 2 iterations and external (unsigned b) to avoid that ineffective code genaration. */
unsigned b;
ONE_ITER
ONE_ITER
}
ip = p->ip64 + (CBcj2Enc_ip_unsigned)(SizeT)(dest - p->bufs[BCJ2_STREAM_MAIN]);
p->bufs[BCJ2_STREAM_MAIN] = dest;
p->ip64 = ip;
if (src == srcLim)
{
WRITE_CONTEXT_AND_SRC
if (src != p->srcLim)
{
p->state = BCJ2_STREAM_MAIN;
return;
}
/* (p->src == p->srcLim)
(p->state == BCJ2_ENC_STATE_ORIG) */
if (p->finishMode != BCJ2_ENC_FINISH_MODE_END_STREAM)
return;
/* (p->finishMode == BCJ2_ENC_FINISH_MODE_END_STREAM */
// (p->flushRem == 5);
p->isFlushState = 1;
break;
}
src++;
// p->src = src;
}
// ip = p->ip; // for marker version
/* marker was found */
/* (v) contains marker that was found:
bits [NUM_SHIFT_BITS : NUM_SHIFT_BITS + 7]
: value of src[-2] : xx/xx/0f
bits [0 : 7] : value of src[-1] : e8/e9/8x
*/
{
{
#if NUM_SHIFT_BITS != 24
v &= ~(UInt32)CONV_FLAG;
#endif
// UInt32 relat = 0;
if ((SizeT)(p->srcLim - src) >= 4)
{
/*
if (relat != 0 || (Byte)v != 0xe8)
BoolInt isBigOffset = True;
*/
const UInt32 relat = GetUi32(src);
/*
#define EXCLUDE_FLAG ((UInt32)1 << 4)
#define NEED_CONVERT(rel) ((((rel) + EXCLUDE_FLAG) & (0 - EXCLUDE_FLAG * 2)) != 0)
if (p->relatExcludeBits != 0)
{
const UInt32 flag = (UInt32)1 << (p->relatExcludeBits - 1);
isBigOffset = (((relat + flag) & (0 - flag * 2)) != 0);
}
// isBigOffset = False; // for debug
*/
ip -= p->fileIp64;
// Use the following if check, if (ip) is 64-bit:
if (ip > (((v + 0x20) >> 5) & 1)) // 23.00 : we eliminate milti-block overlap for (Of 80) and (e8/e9)
if ((CBcj2Enc_ip_unsigned)((CBcj2Enc_ip_signed)ip + 4 + (Int32)relat) <= p->fileSize64_minus1)
if (((UInt32)(relat + p->relatLimit) >> 1) < p->relatLimit)
v |= CONV_FLAG;
}
else if (p->finishMode == BCJ2_ENC_FINISH_MODE_CONTINUE)
{
// (p->srcLim - src < 4)
// /*
// for non-marker version
p->ip64--; // p->ip = ip - 1;
p->bufs[BCJ2_STREAM_MAIN]--;
src--;
v >>= NUM_SHIFT_BITS;
// (0 < p->srcLim - p->src <= 4)
// */
// v |= MARKER_FLAG; // for marker version
/* (p->state == BCJ2_ENC_STATE_ORIG) */
WRITE_CONTEXT_AND_SRC
return;
}
{
const unsigned c = ((v + 0x17) >> 6) & 1;
CBcj2Prob *prob = p->probs + (unsigned)
(((0 - c) & (Byte)(v >> NUM_SHIFT_BITS)) + c + ((v >> 5) & 1));
/*
((Byte)v == 0xe8 ? 2 + ((Byte)(v >> 8)) :
((Byte)v < 0xe8 ? 0 : 1)); // ((v >> 5) & 1));
*/
const unsigned ttt = *prob;
const UInt32 bound = (p->range >> kNumBitModelTotalBits) * ttt;
if ((v & CONV_FLAG) == 0)
{
// static int yyy = 0; yyy++; printf("\n!needConvert = %d\n", yyy);
// v = (Byte)v; // for marker version
p->range = bound;
*prob = (CBcj2Prob)(ttt + ((kBitModelTotal - ttt) >> kNumMoveBits));
// WRITE_CONTEXT_AND_SRC
continue;
}
p->low += bound;
p->range -= bound;
*prob = (CBcj2Prob)(ttt - (ttt >> kNumMoveBits));
}
// p->context = src[3];
{
// const unsigned cj = ((Byte)v == 0xe8 ? BCJ2_STREAM_CALL : BCJ2_STREAM_JUMP);
const unsigned cj = (((v + 0x57) >> 6) & 1) + BCJ2_STREAM_CALL;
ip = p->ip64;
v = GetUi32(src); // relat
ip += 4;
p->ip64 = ip;
src += 4;
// p->src = src;
{
const UInt32 absol = (UInt32)ip + v;
Byte *cur = p->bufs[cj];
v >>= 24;
// WRITE_CONTEXT
if (cur == p->lims[cj])
{
p->state = cj;
p->tempTarget = absol;
WRITE_CONTEXT_AND_SRC
return;
}
SetBe32a(cur, absol)
p->bufs[cj] = cur + 4;
}
}
}
}
} // end of loop
}
for (; p->flushRem != 0; p->flushRem--)
if (Bcj2_RangeEnc_ShiftLow(p))
return;
p->state = BCJ2_ENC_STATE_FINISHED;
}
/*
BCJ2 encoder needs look ahead for up to 4 bytes in (src) buffer.
So base function Bcj2Enc_Encode_2()
in BCJ2_ENC_FINISH_MODE_CONTINUE mode can return with
(p->state == BCJ2_ENC_STATE_ORIG && p->src < p->srcLim)
Bcj2Enc_Encode() solves that look ahead problem by using p->temp[] buffer.
so if (p->state == BCJ2_ENC_STATE_ORIG) after Bcj2Enc_Encode(),
then (p->src == p->srcLim).
And the caller's code is simpler with Bcj2Enc_Encode().
*/
Z7_NO_INLINE
void Bcj2Enc_Encode(CBcj2Enc *p)
{
PRF2("\n----")
if (p->tempPos != 0)
{
/* extra: number of bytes that were copied from (src) to (temp) buffer in this call */
unsigned extra = 0;
/* We will touch only minimal required number of bytes in input (src) stream.
So we will add input bytes from (src) stream to temp[] with step of 1 byte.
We don't add new bytes to temp[] before Bcj2Enc_Encode_2() call
in first loop iteration because
- previous call of Bcj2Enc_Encode() could use another (finishMode),
- previous call could finish with (p->state != BCJ2_ENC_STATE_ORIG).
the case with full temp[] buffer (p->tempPos == 4) is possible here.
*/
for (;;)
{
// (0 < p->tempPos <= 5) // in non-marker version
/* p->src : the current src data position including extra bytes
that were copied to temp[] buffer in this call */
const Byte *src = p->src;
const Byte *srcLim = p->srcLim;
const EBcj2Enc_FinishMode finishMode = p->finishMode;
if (src != srcLim)
{
/* if there are some src data after the data copied to temp[],
then we use MODE_CONTINUE for temp data */
p->finishMode = BCJ2_ENC_FINISH_MODE_CONTINUE;
}
p->src = p->temp;
p->srcLim = p->temp + p->tempPos;
PRF2(" ")
Bcj2Enc_Encode_2(p);
{
const unsigned num = (unsigned)(p->src - p->temp);
const unsigned tempPos = p->tempPos - num;
unsigned i;
p->tempPos = tempPos;
for (i = 0; i < tempPos; i++)
p->temp[i] = p->temp[(SizeT)i + num];
// tempPos : number of bytes in temp buffer
p->src = src;
p->srcLim = srcLim;
p->finishMode = finishMode;
if (p->state != BCJ2_ENC_STATE_ORIG)
{
// (p->tempPos <= 4) // in non-marker version
/* if (the reason of exit from Bcj2Enc_Encode_2()
is not BCJ2_ENC_STATE_ORIG),
then we exit from Bcj2Enc_Encode() with same reason */
// optional code begin : we rollback (src) and tempPos, if it's possible:
if (extra >= tempPos)
extra = tempPos;
p->src = src - extra;
p->tempPos = tempPos - extra;
// optional code end : rollback of (src) and tempPos
return;
}
/* (p->tempPos <= 4)
(p->state == BCJ2_ENC_STATE_ORIG)
so encoder needs more data than in temp[] */
if (src == srcLim)
return; // src buffer has no more input data.
/* (src != srcLim)
so we can provide more input data from src for Bcj2Enc_Encode_2() */
if (extra >= tempPos)
{
/* (extra >= tempPos) means that temp buffer contains
only data from src buffer of this call.
So now we can encode without temp buffer */
p->src = src - tempPos; // rollback (src)
p->tempPos = 0;
break;
}
// we append one additional extra byte from (src) to temp[] buffer:
p->temp[tempPos] = *src;
p->tempPos = tempPos + 1;
// (0 < p->tempPos <= 5) // in non-marker version
p->src = src + 1;
extra++;
}
}
}
PRF2("++++")
// (p->tempPos == 0)
Bcj2Enc_Encode_2(p);
PRF2("====")
if (p->state == BCJ2_ENC_STATE_ORIG)
{
const Byte *src = p->src;
const Byte *srcLim = p->srcLim;
const unsigned rem = (unsigned)(srcLim - src);
/* (rem <= 4) here.
if (p->src != p->srcLim), then
- we copy non-processed bytes from (p->src) to temp[] buffer,
- we set p->src equal to p->srcLim.
*/
if (rem)
{
unsigned i = 0;
p->src = srcLim;
p->tempPos = rem;
// (0 < p->tempPos <= 4)
do
p->temp[i] = src[i];
while (++i != rem);
}
// (p->tempPos <= 4)
// (p->src == p->srcLim)
}
}
#undef PRF2
#undef CONV_FLAG
#undef MARKER_FLAG
#undef WRITE_CONTEXT
#undef WRITE_CONTEXT_AND_SRC
#undef ONE_ITER
#undef NUM_SHIFT_BITS
#undef kTopValue
#undef kNumBitModelTotalBits
#undef kBitModelTotal
#undef kNumMoveBits

48
C/Blake2.h Executable file
View File

@@ -0,0 +1,48 @@
/* Blake2.h -- BLAKE2 Hash
2023-03-04 : Igor Pavlov : Public domain
2015 : Samuel Neves : Public domain */
#ifndef ZIP7_INC_BLAKE2_H
#define ZIP7_INC_BLAKE2_H
#include "7zTypes.h"
EXTERN_C_BEGIN
#define BLAKE2S_BLOCK_SIZE 64
#define BLAKE2S_DIGEST_SIZE 32
#define BLAKE2SP_PARALLEL_DEGREE 8
typedef struct
{
UInt32 h[8];
UInt32 t[2];
UInt32 f[2];
Byte buf[BLAKE2S_BLOCK_SIZE];
UInt32 bufPos;
UInt32 lastNode_f1;
UInt32 dummy[2]; /* for sizeof(CBlake2s) alignment */
} CBlake2s;
/* You need to xor CBlake2s::h[i] with input parameter block after Blake2s_Init0() */
/*
void Blake2s_Init0(CBlake2s *p);
void Blake2s_Update(CBlake2s *p, const Byte *data, size_t size);
void Blake2s_Final(CBlake2s *p, Byte *digest);
*/
typedef struct
{
CBlake2s S[BLAKE2SP_PARALLEL_DEGREE];
unsigned bufPos;
} CBlake2sp;
void Blake2sp_Init(CBlake2sp *p);
void Blake2sp_Update(CBlake2sp *p, const Byte *data, size_t size);
void Blake2sp_Final(CBlake2sp *p, Byte *digest);
EXTERN_C_END
#endif

250
C/Blake2s.c Executable file
View File

@@ -0,0 +1,250 @@
/* Blake2s.c -- BLAKE2s and BLAKE2sp Hash
2023-03-04 : Igor Pavlov : Public domain
2015 : Samuel Neves : Public domain */
#include "Precomp.h"
#include <string.h>
#include "Blake2.h"
#include "CpuArch.h"
#include "RotateDefs.h"
#define rotr32 rotrFixed
#define BLAKE2S_NUM_ROUNDS 10
#define BLAKE2S_FINAL_FLAG (~(UInt32)0)
static const UInt32 k_Blake2s_IV[8] =
{
0x6A09E667UL, 0xBB67AE85UL, 0x3C6EF372UL, 0xA54FF53AUL,
0x510E527FUL, 0x9B05688CUL, 0x1F83D9ABUL, 0x5BE0CD19UL
};
static const Byte k_Blake2s_Sigma[BLAKE2S_NUM_ROUNDS][16] =
{
{ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 } ,
{ 14, 10, 4, 8, 9, 15, 13, 6, 1, 12, 0, 2, 11, 7, 5, 3 } ,
{ 11, 8, 12, 0, 5, 2, 15, 13, 10, 14, 3, 6, 7, 1, 9, 4 } ,
{ 7, 9, 3, 1, 13, 12, 11, 14, 2, 6, 5, 10, 4, 0, 15, 8 } ,
{ 9, 0, 5, 7, 2, 4, 10, 15, 14, 1, 11, 12, 6, 8, 3, 13 } ,
{ 2, 12, 6, 10, 0, 11, 8, 3, 4, 13, 7, 5, 15, 14, 1, 9 } ,
{ 12, 5, 1, 15, 14, 13, 4, 10, 0, 7, 6, 3, 9, 2, 8, 11 } ,
{ 13, 11, 7, 14, 12, 1, 3, 9, 5, 0, 15, 4, 8, 6, 2, 10 } ,
{ 6, 15, 14, 9, 11, 3, 0, 8, 12, 2, 13, 7, 1, 4, 10, 5 } ,
{ 10, 2, 8, 4, 7, 6, 1, 5, 15, 11, 9, 14, 3, 12, 13 , 0 } ,
};
static void Blake2s_Init0(CBlake2s *p)
{
unsigned i;
for (i = 0; i < 8; i++)
p->h[i] = k_Blake2s_IV[i];
p->t[0] = 0;
p->t[1] = 0;
p->f[0] = 0;
p->f[1] = 0;
p->bufPos = 0;
p->lastNode_f1 = 0;
}
static void Blake2s_Compress(CBlake2s *p)
{
UInt32 m[16];
UInt32 v[16];
{
unsigned i;
for (i = 0; i < 16; i++)
m[i] = GetUi32(p->buf + i * sizeof(m[i]));
for (i = 0; i < 8; i++)
v[i] = p->h[i];
}
v[ 8] = k_Blake2s_IV[0];
v[ 9] = k_Blake2s_IV[1];
v[10] = k_Blake2s_IV[2];
v[11] = k_Blake2s_IV[3];
v[12] = p->t[0] ^ k_Blake2s_IV[4];
v[13] = p->t[1] ^ k_Blake2s_IV[5];
v[14] = p->f[0] ^ k_Blake2s_IV[6];
v[15] = p->f[1] ^ k_Blake2s_IV[7];
#define G(r,i,a,b,c,d) \
a += b + m[sigma[2*i+0]]; d ^= a; d = rotr32(d, 16); c += d; b ^= c; b = rotr32(b, 12); \
a += b + m[sigma[2*i+1]]; d ^= a; d = rotr32(d, 8); c += d; b ^= c; b = rotr32(b, 7); \
#define R(r) \
G(r,0,v[ 0],v[ 4],v[ 8],v[12]) \
G(r,1,v[ 1],v[ 5],v[ 9],v[13]) \
G(r,2,v[ 2],v[ 6],v[10],v[14]) \
G(r,3,v[ 3],v[ 7],v[11],v[15]) \
G(r,4,v[ 0],v[ 5],v[10],v[15]) \
G(r,5,v[ 1],v[ 6],v[11],v[12]) \
G(r,6,v[ 2],v[ 7],v[ 8],v[13]) \
G(r,7,v[ 3],v[ 4],v[ 9],v[14]) \
{
unsigned r;
for (r = 0; r < BLAKE2S_NUM_ROUNDS; r++)
{
const Byte *sigma = k_Blake2s_Sigma[r];
R(r)
}
/* R(0); R(1); R(2); R(3); R(4); R(5); R(6); R(7); R(8); R(9); */
}
#undef G
#undef R
{
unsigned i;
for (i = 0; i < 8; i++)
p->h[i] ^= v[i] ^ v[i + 8];
}
}
#define Blake2s_Increment_Counter(S, inc) \
{ p->t[0] += (inc); p->t[1] += (p->t[0] < (inc)); }
#define Blake2s_Set_LastBlock(p) \
{ p->f[0] = BLAKE2S_FINAL_FLAG; p->f[1] = p->lastNode_f1; }
static void Blake2s_Update(CBlake2s *p, const Byte *data, size_t size)
{
while (size != 0)
{
unsigned pos = (unsigned)p->bufPos;
unsigned rem = BLAKE2S_BLOCK_SIZE - pos;
if (size <= rem)
{
memcpy(p->buf + pos, data, size);
p->bufPos += (UInt32)size;
return;
}
memcpy(p->buf + pos, data, rem);
Blake2s_Increment_Counter(S, BLAKE2S_BLOCK_SIZE)
Blake2s_Compress(p);
p->bufPos = 0;
data += rem;
size -= rem;
}
}
static void Blake2s_Final(CBlake2s *p, Byte *digest)
{
unsigned i;
Blake2s_Increment_Counter(S, (UInt32)p->bufPos)
Blake2s_Set_LastBlock(p)
memset(p->buf + p->bufPos, 0, BLAKE2S_BLOCK_SIZE - p->bufPos);
Blake2s_Compress(p);
for (i = 0; i < 8; i++)
{
SetUi32(digest + sizeof(p->h[i]) * i, p->h[i])
}
}
/* ---------- BLAKE2s ---------- */
/* we need to xor CBlake2s::h[i] with input parameter block after Blake2s_Init0() */
/*
typedef struct
{
Byte digest_length;
Byte key_length;
Byte fanout;
Byte depth;
UInt32 leaf_length;
Byte node_offset[6];
Byte node_depth;
Byte inner_length;
Byte salt[BLAKE2S_SALTBYTES];
Byte personal[BLAKE2S_PERSONALBYTES];
} CBlake2sParam;
*/
static void Blake2sp_Init_Spec(CBlake2s *p, unsigned node_offset, unsigned node_depth)
{
Blake2s_Init0(p);
p->h[0] ^= (BLAKE2S_DIGEST_SIZE | ((UInt32)BLAKE2SP_PARALLEL_DEGREE << 16) | ((UInt32)2 << 24));
p->h[2] ^= ((UInt32)node_offset);
p->h[3] ^= ((UInt32)node_depth << 16) | ((UInt32)BLAKE2S_DIGEST_SIZE << 24);
/*
P->digest_length = BLAKE2S_DIGEST_SIZE;
P->key_length = 0;
P->fanout = BLAKE2SP_PARALLEL_DEGREE;
P->depth = 2;
P->leaf_length = 0;
store48(P->node_offset, node_offset);
P->node_depth = node_depth;
P->inner_length = BLAKE2S_DIGEST_SIZE;
*/
}
void Blake2sp_Init(CBlake2sp *p)
{
unsigned i;
p->bufPos = 0;
for (i = 0; i < BLAKE2SP_PARALLEL_DEGREE; i++)
Blake2sp_Init_Spec(&p->S[i], i, 0);
p->S[BLAKE2SP_PARALLEL_DEGREE - 1].lastNode_f1 = BLAKE2S_FINAL_FLAG;
}
void Blake2sp_Update(CBlake2sp *p, const Byte *data, size_t size)
{
unsigned pos = p->bufPos;
while (size != 0)
{
unsigned index = pos / BLAKE2S_BLOCK_SIZE;
unsigned rem = BLAKE2S_BLOCK_SIZE - (pos & (BLAKE2S_BLOCK_SIZE - 1));
if (rem > size)
rem = (unsigned)size;
Blake2s_Update(&p->S[index], data, rem);
size -= rem;
data += rem;
pos += rem;
pos &= (BLAKE2S_BLOCK_SIZE * BLAKE2SP_PARALLEL_DEGREE - 1);
}
p->bufPos = pos;
}
void Blake2sp_Final(CBlake2sp *p, Byte *digest)
{
CBlake2s R;
unsigned i;
Blake2sp_Init_Spec(&R, 0, 1);
R.lastNode_f1 = BLAKE2S_FINAL_FLAG;
for (i = 0; i < BLAKE2SP_PARALLEL_DEGREE; i++)
{
Byte hash[BLAKE2S_DIGEST_SIZE];
Blake2s_Final(&p->S[i], hash);
Blake2s_Update(&R, hash, BLAKE2S_DIGEST_SIZE);
}
Blake2s_Final(&R, digest);
}
#undef rotr32

493
C/Bra.c
View File

@@ -1,133 +1,420 @@
/* Bra.c -- Converters for RISC code
2008-10-04 : Igor Pavlov : Public domain */
/* Bra.c -- Branch converters for RISC code
2023-04-02 : Igor Pavlov : Public domain */
#include "Precomp.h"
#include "Bra.h"
#include "CpuArch.h"
#include "RotateDefs.h"
SizeT ARM_Convert(Byte *data, SizeT size, UInt32 ip, int encoding)
#if defined(MY_CPU_SIZEOF_POINTER) \
&& ( MY_CPU_SIZEOF_POINTER == 4 \
|| MY_CPU_SIZEOF_POINTER == 8)
#define BR_CONV_USE_OPT_PC_PTR
#endif
#ifdef BR_CONV_USE_OPT_PC_PTR
#define BR_PC_INIT pc -= (UInt32)(SizeT)p;
#define BR_PC_GET (pc + (UInt32)(SizeT)p)
#else
#define BR_PC_INIT pc += (UInt32)size;
#define BR_PC_GET (pc - (UInt32)(SizeT)(lim - p))
// #define BR_PC_INIT
// #define BR_PC_GET (pc + (UInt32)(SizeT)(p - data))
#endif
#define BR_CONVERT_VAL(v, c) if (encoding) v += c; else v -= c;
// #define BR_CONVERT_VAL(v, c) if (!encoding) c = (UInt32)0 - c; v += c;
#define Z7_BRANCH_CONV(name) z7_BranchConv_ ## name
#define Z7_BRANCH_FUNC_MAIN(name) \
static \
Z7_FORCE_INLINE \
Z7_ATTRIB_NO_VECTOR \
Byte *Z7_BRANCH_CONV(name)(Byte *p, SizeT size, UInt32 pc, int encoding)
#define Z7_BRANCH_FUNC_IMP(name, m, encoding) \
Z7_NO_INLINE \
Z7_ATTRIB_NO_VECTOR \
Byte *m(name)(Byte *data, SizeT size, UInt32 pc) \
{ return Z7_BRANCH_CONV(name)(data, size, pc, encoding); } \
#ifdef Z7_EXTRACT_ONLY
#define Z7_BRANCH_FUNCS_IMP(name) \
Z7_BRANCH_FUNC_IMP(name, Z7_BRANCH_CONV_DEC, 0)
#else
#define Z7_BRANCH_FUNCS_IMP(name) \
Z7_BRANCH_FUNC_IMP(name, Z7_BRANCH_CONV_DEC, 0) \
Z7_BRANCH_FUNC_IMP(name, Z7_BRANCH_CONV_ENC, 1)
#endif
#if defined(__clang__)
#define BR_EXTERNAL_FOR
#define BR_NEXT_ITERATION continue;
#else
#define BR_EXTERNAL_FOR for (;;)
#define BR_NEXT_ITERATION break;
#endif
#if defined(__clang__) && (__clang_major__ >= 8) \
|| defined(__GNUC__) && (__GNUC__ >= 1000) \
// GCC is not good for __builtin_expect() here
/* || defined(_MSC_VER) && (_MSC_VER >= 1920) */
// #define Z7_unlikely [[unlikely]]
// #define Z7_LIKELY(x) (__builtin_expect((x), 1))
#define Z7_UNLIKELY(x) (__builtin_expect((x), 0))
// #define Z7_likely [[likely]]
#else
// #define Z7_LIKELY(x) (x)
#define Z7_UNLIKELY(x) (x)
// #define Z7_likely
#endif
Z7_BRANCH_FUNC_MAIN(ARM64)
{
SizeT i;
if (size < 4)
return 0;
size -= 4;
ip += 8;
for (i = 0; i <= size; i += 4)
// Byte *p = data;
const Byte *lim;
const UInt32 flag = (UInt32)1 << (24 - 4);
const UInt32 mask = ((UInt32)1 << 24) - (flag << 1);
size &= ~(SizeT)3;
// if (size == 0) return p;
lim = p + size;
BR_PC_INIT
pc -= 4; // because (p) will point to next instruction
BR_EXTERNAL_FOR
{
if (data[i + 3] == 0xEB)
// Z7_PRAGMA_OPT_DISABLE_LOOP_UNROLL_VECTORIZE
for (;;)
{
UInt32 dest;
UInt32 src = ((UInt32)data[i + 2] << 16) | ((UInt32)data[i + 1] << 8) | (data[i + 0]);
src <<= 2;
if (encoding)
dest = ip + (UInt32)i + src;
else
dest = src - (ip + (UInt32)i);
dest >>= 2;
data[i + 2] = (Byte)(dest >> 16);
data[i + 1] = (Byte)(dest >> 8);
data[i + 0] = (Byte)dest;
UInt32 v;
if Z7_UNLIKELY(p == lim)
return p;
v = GetUi32a(p);
p += 4;
if Z7_UNLIKELY(((v - 0x94000000) & 0xfc000000) == 0)
{
UInt32 c = BR_PC_GET >> 2;
BR_CONVERT_VAL(v, c)
v &= 0x03ffffff;
v |= 0x94000000;
SetUi32a(p - 4, v)
BR_NEXT_ITERATION
}
// v = rotlFixed(v, 8); v += (flag << 8) - 0x90; if Z7_UNLIKELY((v & ((mask << 8) + 0x9f)) == 0)
v -= 0x90000000; if Z7_UNLIKELY((v & 0x9f000000) == 0)
{
UInt32 z, c;
// v = rotrFixed(v, 8);
v += flag; if Z7_UNLIKELY(v & mask) continue;
z = (v & 0xffffffe0) | (v >> 26);
c = (BR_PC_GET >> (12 - 3)) & ~(UInt32)7;
BR_CONVERT_VAL(z, c)
v &= 0x1f;
v |= 0x90000000;
v |= z << 26;
v |= 0x00ffffe0 & ((z & (((flag << 1) - 1))) - flag);
SetUi32a(p - 4, v)
}
}
}
return i;
}
Z7_BRANCH_FUNCS_IMP(ARM64)
SizeT ARMT_Convert(Byte *data, SizeT size, UInt32 ip, int encoding)
Z7_BRANCH_FUNC_MAIN(ARM)
{
SizeT i;
if (size < 4)
return 0;
size -= 4;
ip += 4;
for (i = 0; i <= size; i += 2)
// Byte *p = data;
const Byte *lim;
size &= ~(SizeT)3;
lim = p + size;
BR_PC_INIT
/* in ARM: branch offset is relative to the +2 instructions from current instruction.
(p) will point to next instruction */
pc += 8 - 4;
for (;;)
{
if ((data[i + 1] & 0xF8) == 0xF0 &&
(data[i + 3] & 0xF8) == 0xF8)
for (;;)
{
UInt32 dest;
UInt32 src =
(((UInt32)data[i + 1] & 0x7) << 19) |
((UInt32)data[i + 0] << 11) |
(((UInt32)data[i + 3] & 0x7) << 8) |
(data[i + 2]);
src <<= 1;
if (encoding)
dest = ip + (UInt32)i + src;
else
dest = src - (ip + (UInt32)i);
dest >>= 1;
data[i + 1] = (Byte)(0xF0 | ((dest >> 19) & 0x7));
data[i + 0] = (Byte)(dest >> 11);
data[i + 3] = (Byte)(0xF8 | ((dest >> 8) & 0x7));
data[i + 2] = (Byte)dest;
i += 2;
if Z7_UNLIKELY(p >= lim) { return p; } p += 4; if Z7_UNLIKELY(p[-1] == 0xeb) break;
if Z7_UNLIKELY(p >= lim) { return p; } p += 4; if Z7_UNLIKELY(p[-1] == 0xeb) break;
}
{
UInt32 v = GetUi32a(p - 4);
UInt32 c = BR_PC_GET >> 2;
BR_CONVERT_VAL(v, c)
v &= 0x00ffffff;
v |= 0xeb000000;
SetUi32a(p - 4, v)
}
}
return i;
}
Z7_BRANCH_FUNCS_IMP(ARM)
SizeT PPC_Convert(Byte *data, SizeT size, UInt32 ip, int encoding)
Z7_BRANCH_FUNC_MAIN(PPC)
{
SizeT i;
if (size < 4)
return 0;
size -= 4;
for (i = 0; i <= size; i += 4)
// Byte *p = data;
const Byte *lim;
size &= ~(SizeT)3;
lim = p + size;
BR_PC_INIT
pc -= 4; // because (p) will point to next instruction
for (;;)
{
if ((data[i] >> 2) == 0x12 && (data[i + 3] & 3) == 1)
UInt32 v;
for (;;)
{
UInt32 src = ((UInt32)(data[i + 0] & 3) << 24) |
((UInt32)data[i + 1] << 16) |
((UInt32)data[i + 2] << 8) |
((UInt32)data[i + 3] & (~3));
UInt32 dest;
if (encoding)
dest = ip + (UInt32)i + src;
else
dest = src - (ip + (UInt32)i);
data[i + 0] = (Byte)(0x48 | ((dest >> 24) & 0x3));
data[i + 1] = (Byte)(dest >> 16);
data[i + 2] = (Byte)(dest >> 8);
data[i + 3] &= 0x3;
data[i + 3] |= dest;
if Z7_UNLIKELY(p == lim)
return p;
// v = GetBe32a(p);
v = *(UInt32 *)(void *)p;
p += 4;
// if ((v & 0xfc000003) == 0x48000001) break;
// if ((p[-4] & 0xFC) == 0x48 && (p[-1] & 3) == 1) break;
if Z7_UNLIKELY(
((v - Z7_CONV_BE_TO_NATIVE_CONST32(0x48000001))
& Z7_CONV_BE_TO_NATIVE_CONST32(0xfc000003)) == 0) break;
}
{
v = Z7_CONV_NATIVE_TO_BE_32(v);
{
UInt32 c = BR_PC_GET;
BR_CONVERT_VAL(v, c)
}
v &= 0x03ffffff;
v |= 0x48000000;
SetBe32a(p - 4, v)
}
}
return i;
}
Z7_BRANCH_FUNCS_IMP(PPC)
SizeT SPARC_Convert(Byte *data, SizeT size, UInt32 ip, int encoding)
#ifdef Z7_CPU_FAST_ROTATE_SUPPORTED
#define BR_SPARC_USE_ROTATE
#endif
Z7_BRANCH_FUNC_MAIN(SPARC)
{
UInt32 i;
if (size < 4)
return 0;
size -= 4;
for (i = 0; i <= size; i += 4)
// Byte *p = data;
const Byte *lim;
const UInt32 flag = (UInt32)1 << 22;
size &= ~(SizeT)3;
lim = p + size;
BR_PC_INIT
pc -= 4; // because (p) will point to next instruction
for (;;)
{
if (data[i] == 0x40 && (data[i + 1] & 0xC0) == 0x00 ||
data[i] == 0x7F && (data[i + 1] & 0xC0) == 0xC0)
UInt32 v;
for (;;)
{
UInt32 src =
((UInt32)data[i + 0] << 24) |
((UInt32)data[i + 1] << 16) |
((UInt32)data[i + 2] << 8) |
((UInt32)data[i + 3]);
UInt32 dest;
src <<= 2;
if (encoding)
dest = ip + i + src;
else
dest = src - (ip + i);
dest >>= 2;
dest = (((0 - ((dest >> 22) & 1)) << 22) & 0x3FFFFFFF) | (dest & 0x3FFFFF) | 0x40000000;
data[i + 0] = (Byte)(dest >> 24);
data[i + 1] = (Byte)(dest >> 16);
data[i + 2] = (Byte)(dest >> 8);
data[i + 3] = (Byte)dest;
if Z7_UNLIKELY(p == lim)
return p;
/* // the code without GetBe32a():
{ const UInt32 v = GetUi16a(p) & 0xc0ff; p += 4; if (v == 0x40 || v == 0xc07f) break; }
*/
v = GetBe32a(p);
p += 4;
#ifdef BR_SPARC_USE_ROTATE
v = rotlFixed(v, 2);
v += (flag << 2) - 1;
if Z7_UNLIKELY((v & (3 - (flag << 3))) == 0)
#else
v += (UInt32)5 << 29;
v ^= (UInt32)7 << 29;
v += flag;
if Z7_UNLIKELY((v & (0 - (flag << 1))) == 0)
#endif
break;
}
{
// UInt32 v = GetBe32a(p - 4);
#ifndef BR_SPARC_USE_ROTATE
v <<= 2;
#endif
{
UInt32 c = BR_PC_GET;
BR_CONVERT_VAL(v, c)
}
v &= (flag << 3) - 1;
#ifdef BR_SPARC_USE_ROTATE
v -= (flag << 2) - 1;
v = rotrFixed(v, 2);
#else
v -= (flag << 2);
v >>= 2;
v |= (UInt32)1 << 30;
#endif
SetBe32a(p - 4, v)
}
}
return i;
}
Z7_BRANCH_FUNCS_IMP(SPARC)
Z7_BRANCH_FUNC_MAIN(ARMT)
{
// Byte *p = data;
Byte *lim;
size &= ~(SizeT)1;
// if (size == 0) return p;
if (size <= 2) return p;
size -= 2;
lim = p + size;
BR_PC_INIT
/* in ARM: branch offset is relative to the +2 instructions from current instruction.
(p) will point to the +2 instructions from current instruction */
// pc += 4 - 4;
// if (encoding) pc -= 0xf800 << 1; else pc += 0xf800 << 1;
// #define ARMT_TAIL_PROC { goto armt_tail; }
#define ARMT_TAIL_PROC { return p; }
do
{
/* in MSVC 32-bit x86 compilers:
UInt32 version : it loads value from memory with movzx
Byte version : it loads value to 8-bit register (AL/CL)
movzx version is slightly faster in some cpus
*/
unsigned b1;
// Byte / unsigned
b1 = p[1];
// optimized version to reduce one (p >= lim) check:
// unsigned a1 = p[1]; b1 = p[3]; p += 2; if Z7_LIKELY((b1 & (a1 ^ 8)) < 0xf8)
for (;;)
{
unsigned b3; // Byte / UInt32
/* (Byte)(b3) normalization can use low byte computations in MSVC.
It gives smaller code, and no loss of speed in some compilers/cpus.
But new MSVC 32-bit x86 compilers use more slow load
from memory to low byte register in that case.
So we try to use full 32-bit computations for faster code.
*/
// if (p >= lim) { ARMT_TAIL_PROC } b3 = b1 + 8; b1 = p[3]; p += 2; if ((b3 & b1) >= 0xf8) break;
if Z7_UNLIKELY(p >= lim) { ARMT_TAIL_PROC } b3 = p[3]; p += 2; if Z7_UNLIKELY((b3 & (b1 ^ 8)) >= 0xf8) break;
if Z7_UNLIKELY(p >= lim) { ARMT_TAIL_PROC } b1 = p[3]; p += 2; if Z7_UNLIKELY((b1 & (b3 ^ 8)) >= 0xf8) break;
}
{
/* we can adjust pc for (0xf800) to rid of (& 0x7FF) operation.
But gcc/clang for arm64 can use bfi instruction for full code here */
UInt32 v =
((UInt32)GetUi16a(p - 2) << 11) |
((UInt32)GetUi16a(p) & 0x7FF);
/*
UInt32 v =
((UInt32)p[1 - 2] << 19)
+ (((UInt32)p[1] & 0x7) << 8)
+ (((UInt32)p[-2] << 11))
+ (p[0]);
*/
p += 2;
{
UInt32 c = BR_PC_GET >> 1;
BR_CONVERT_VAL(v, c)
}
SetUi16a(p - 4, (UInt16)(((v >> 11) & 0x7ff) | 0xf000))
SetUi16a(p - 2, (UInt16)(v | 0xf800))
/*
p[-4] = (Byte)(v >> 11);
p[-3] = (Byte)(0xf0 | ((v >> 19) & 0x7));
p[-2] = (Byte)v;
p[-1] = (Byte)(0xf8 | (v >> 8));
*/
}
}
while (p < lim);
return p;
// armt_tail:
// if ((Byte)((lim[1] & 0xf8)) != 0xf0) { lim += 2; } return lim;
// return (Byte *)(lim + ((Byte)((lim[1] ^ 0xf0) & 0xf8) == 0 ? 0 : 2));
// return (Byte *)(lim + (((lim[1] ^ ~0xfu) & ~7u) == 0 ? 0 : 2));
// return (Byte *)(lim + 2 - (((((unsigned)lim[1] ^ 8) + 8) >> 7) & 2));
}
Z7_BRANCH_FUNCS_IMP(ARMT)
// #define BR_IA64_NO_INLINE
Z7_BRANCH_FUNC_MAIN(IA64)
{
// Byte *p = data;
const Byte *lim;
size &= ~(SizeT)15;
lim = p + size;
pc -= 1 << 4;
pc >>= 4 - 1;
// pc -= 1 << 1;
for (;;)
{
unsigned m;
for (;;)
{
if Z7_UNLIKELY(p == lim)
return p;
m = (unsigned)((UInt32)0x334b0000 >> (*p & 0x1e));
p += 16;
pc += 1 << 1;
if (m &= 3)
break;
}
{
p += (ptrdiff_t)m * 5 - 20; // negative value is expected here.
do
{
const UInt32 t =
#if defined(MY_CPU_X86_OR_AMD64)
// we use 32-bit load here to reduce code size on x86:
GetUi32(p);
#else
GetUi16(p);
#endif
UInt32 z = GetUi32(p + 1) >> m;
p += 5;
if (((t >> m) & (0x70 << 1)) == 0
&& ((z - (0x5000000 << 1)) & (0xf000000 << 1)) == 0)
{
UInt32 v = (UInt32)((0x8fffff << 1) | 1) & z;
z ^= v;
#ifdef BR_IA64_NO_INLINE
v |= (v & ((UInt32)1 << (23 + 1))) >> 3;
{
UInt32 c = pc;
BR_CONVERT_VAL(v, c)
}
v &= (0x1fffff << 1) | 1;
#else
{
if (encoding)
{
// pc &= ~(0xc00000 << 1); // we just need to clear at least 2 bits
pc &= (0x1fffff << 1) | 1;
v += pc;
}
else
{
// pc |= 0xc00000 << 1; // we need to set at least 2 bits
pc |= ~(UInt32)((0x1fffff << 1) | 1);
v -= pc;
}
}
v &= ~(UInt32)(0x600000 << 1);
#endif
v += (0x700000 << 1);
v &= (0x8fffff << 1) | 1;
z |= v;
z <<= m;
SetUi32(p + 1 - 5, z)
}
m++;
}
while (m &= 3); // while (m < 4);
}
}
}
Z7_BRANCH_FUNCS_IMP(IA64)

119
C/Bra.h
View File

@@ -1,60 +1,99 @@
/* Bra.h -- Branch converters for executables
2008-10-04 : Igor Pavlov : Public domain */
2023-04-02 : Igor Pavlov : Public domain */
#ifndef __BRA_H
#define __BRA_H
#ifndef ZIP7_INC_BRA_H
#define ZIP7_INC_BRA_H
#include "Types.h"
#include "7zTypes.h"
EXTERN_C_BEGIN
#define Z7_BRANCH_CONV_DEC(name) z7_BranchConv_ ## name ## _Dec
#define Z7_BRANCH_CONV_ENC(name) z7_BranchConv_ ## name ## _Enc
#define Z7_BRANCH_CONV_ST_DEC(name) z7_BranchConvSt_ ## name ## _Dec
#define Z7_BRANCH_CONV_ST_ENC(name) z7_BranchConvSt_ ## name ## _Enc
#define Z7_BRANCH_CONV_DECL(name) Byte * name(Byte *data, SizeT size, UInt32 pc)
#define Z7_BRANCH_CONV_ST_DECL(name) Byte * name(Byte *data, SizeT size, UInt32 pc, UInt32 *state)
typedef Z7_BRANCH_CONV_DECL( (*z7_Func_BranchConv));
typedef Z7_BRANCH_CONV_ST_DECL((*z7_Func_BranchConvSt));
#define Z7_BRANCH_CONV_ST_X86_STATE_INIT_VAL 0
Z7_BRANCH_CONV_ST_DECL(Z7_BRANCH_CONV_ST_DEC(X86));
Z7_BRANCH_CONV_ST_DECL(Z7_BRANCH_CONV_ST_ENC(X86));
#define Z7_BRANCH_FUNCS_DECL(name) \
Z7_BRANCH_CONV_DECL(Z7_BRANCH_CONV_DEC(name)); \
Z7_BRANCH_CONV_DECL(Z7_BRANCH_CONV_ENC(name));
Z7_BRANCH_FUNCS_DECL(ARM64)
Z7_BRANCH_FUNCS_DECL(ARM)
Z7_BRANCH_FUNCS_DECL(ARMT)
Z7_BRANCH_FUNCS_DECL(PPC)
Z7_BRANCH_FUNCS_DECL(SPARC)
Z7_BRANCH_FUNCS_DECL(IA64)
/*
These functions convert relative addresses to absolute addresses
in CALL instructions to increase the compression ratio.
In:
data - data buffer
size - size of data
ip - current virtual Instruction Pinter (IP) value
state - state variable for x86 converter
encoding - 0 (for decoding), 1 (for encoding)
Out:
state - state variable for x86 converter
These functions convert data that contain CPU instructions.
Each such function converts relative addresses to absolute addresses in some
branch instructions: CALL (in all converters) and JUMP (X86 converter only).
Such conversion allows to increase compression ratio, if we compress that data.
Returns:
The number of processed bytes. If you call these functions with multiple calls,
you must start next call with first byte after block of processed bytes.
There are 2 types of converters:
Byte * Conv_RISC (Byte *data, SizeT size, UInt32 pc);
Byte * ConvSt_X86(Byte *data, SizeT size, UInt32 pc, UInt32 *state);
Each Converter supports 2 versions: one for encoding
and one for decoding (_Enc/_Dec postfixes in function name).
In params:
data : data buffer
size : size of data
pc : current virtual Program Counter (Instruction Pinter) value
In/Out param:
state : pointer to state variable (for X86 converter only)
Return:
The pointer to position in (data) buffer after last byte that was processed.
If the caller calls converter again, it must call it starting with that position.
But the caller is allowed to move data in buffer. so pointer to
current processed position also will be changed for next call.
Also the caller must increase internal (pc) value for next call.
Each converter has some characteristics: Endian, Alignment, LookAhead.
Type Endian Alignment LookAhead
x86 little 1 4
X86 little 1 4
ARMT little 2 2
ARM little 4 0
ARM64 little 4 0
PPC big 4 0
SPARC big 4 0
IA64 little 16 0
size must be >= Alignment + LookAhead, if it's not last block.
If (size < Alignment + LookAhead), converter returns 0.
(data) must be aligned for (Alignment).
processed size can be calculated as:
SizeT processed = Conv(data, size, pc) - data;
if (processed == 0)
it means that converter needs more data for processing.
If (size < Alignment + LookAhead)
then (processed == 0) is allowed.
Example:
UInt32 ip = 0;
for ()
{
; size must be >= Alignment + LookAhead, if it's not last block
SizeT processed = Convert(data, size, ip, 1);
data += processed;
size -= processed;
ip += processed;
}
Example code for conversion in loop:
UInt32 pc = 0;
size = 0;
for (;;)
{
size += Load_more_input_data(data + size);
SizeT processed = Conv(data, size, pc) - data;
if (processed == 0 && no_more_input_data_after_size)
break; // we stop convert loop
data += processed;
size -= processed;
pc += processed;
}
*/
#define x86_Convert_Init(state) { state = 0; }
SizeT x86_Convert(Byte *data, SizeT size, UInt32 ip, UInt32 *state, int encoding);
SizeT ARM_Convert(Byte *data, SizeT size, UInt32 ip, int encoding);
SizeT ARMT_Convert(Byte *data, SizeT size, UInt32 ip, int encoding);
SizeT PPC_Convert(Byte *data, SizeT size, UInt32 ip, int encoding);
SizeT SPARC_Convert(Byte *data, SizeT size, UInt32 ip, int encoding);
SizeT IA64_Convert(Byte *data, SizeT size, UInt32 ip, int encoding);
EXTERN_C_END
#endif

236
C/Bra86.c
View File

@@ -1,85 +1,187 @@
/* Bra86.c -- Converter for x86 code (BCJ)
2008-10-04 : Igor Pavlov : Public domain */
/* Bra86.c -- Branch converter for X86 code (BCJ)
2023-04-02 : Igor Pavlov : Public domain */
#include "Precomp.h"
#include "Bra.h"
#include "CpuArch.h"
#define Test86MSByte(b) ((b) == 0 || (b) == 0xFF)
const Byte kMaskToAllowedStatus[8] = {1, 1, 1, 0, 1, 0, 0, 0};
const Byte kMaskToBitNumber[8] = {0, 1, 2, 2, 3, 3, 3, 3};
#if defined(MY_CPU_SIZEOF_POINTER) \
&& ( MY_CPU_SIZEOF_POINTER == 4 \
|| MY_CPU_SIZEOF_POINTER == 8)
#define BR_CONV_USE_OPT_PC_PTR
#endif
SizeT x86_Convert(Byte *data, SizeT size, UInt32 ip, UInt32 *state, int encoding)
#ifdef BR_CONV_USE_OPT_PC_PTR
#define BR_PC_INIT pc -= (UInt32)(SizeT)p; // (MY_uintptr_t)
#define BR_PC_GET (pc + (UInt32)(SizeT)p)
#else
#define BR_PC_INIT pc += (UInt32)size;
#define BR_PC_GET (pc - (UInt32)(SizeT)(lim - p))
// #define BR_PC_INIT
// #define BR_PC_GET (pc + (UInt32)(SizeT)(p - data))
#endif
#define BR_CONVERT_VAL(v, c) if (encoding) v += c; else v -= c;
// #define BR_CONVERT_VAL(v, c) if (!encoding) c = (UInt32)0 - c; v += c;
#define Z7_BRANCH_CONV_ST(name) z7_BranchConvSt_ ## name
#define BR86_NEED_CONV_FOR_MS_BYTE(b) ((((b) + 1) & 0xfe) == 0)
#ifdef MY_CPU_LE_UNALIGN
#define BR86_PREPARE_BCJ_SCAN const UInt32 v = GetUi32(p) ^ 0xe8e8e8e8;
#define BR86_IS_BCJ_BYTE(n) ((v & ((UInt32)0xfe << (n) * 8)) == 0)
#else
#define BR86_PREPARE_BCJ_SCAN
// bad for MSVC X86 (partial write to byte reg):
#define BR86_IS_BCJ_BYTE(n) ((p[n - 4] & 0xfe) == 0xe8)
// bad for old MSVC (partial write to byte reg):
// #define BR86_IS_BCJ_BYTE(n) (((*p ^ 0xe8) & 0xfe) == 0)
#endif
static
Z7_FORCE_INLINE
Z7_ATTRIB_NO_VECTOR
Byte *Z7_BRANCH_CONV_ST(X86)(Byte *p, SizeT size, UInt32 pc, UInt32 *state, int encoding)
{
SizeT bufferPos = 0, prevPosT;
UInt32 prevMask = *state & 0x7;
if (size < 5)
return 0;
ip += 5;
prevPosT = (SizeT)0 - 1;
return p;
{
// Byte *p = data;
const Byte *lim = p + size - 4;
unsigned mask = (unsigned)*state; // & 7;
#ifdef BR_CONV_USE_OPT_PC_PTR
/* if BR_CONV_USE_OPT_PC_PTR is defined: we need to adjust (pc) for (+4),
because call/jump offset is relative to the next instruction.
if BR_CONV_USE_OPT_PC_PTR is not defined : we don't need to adjust (pc) for (+4),
because BR_PC_GET uses (pc - (lim - p)), and lim was adjusted for (-4) before.
*/
pc += 4;
#endif
BR_PC_INIT
goto start;
for (;;)
for (;; mask |= 4)
{
Byte *p = data + bufferPos;
Byte *limit = data + size - 4;
for (; p < limit; p++)
if ((*p & 0xFE) == 0xE8)
break;
bufferPos = (SizeT)(p - data);
if (p >= limit)
break;
prevPosT = bufferPos - prevPosT;
if (prevPosT > 3)
prevMask = 0;
else
// cont: mask |= 4;
start:
if (p >= lim)
goto fin;
{
prevMask = (prevMask << ((int)prevPosT - 1)) & 0x7;
if (prevMask != 0)
{
Byte b = p[4 - kMaskToBitNumber[prevMask]];
if (!kMaskToAllowedStatus[prevMask] || Test86MSByte(b))
{
prevPosT = bufferPos;
prevMask = ((prevMask << 1) & 0x7) | 1;
bufferPos++;
continue;
}
}
BR86_PREPARE_BCJ_SCAN
p += 4;
if (BR86_IS_BCJ_BYTE(0)) { goto m0; } mask >>= 1;
if (BR86_IS_BCJ_BYTE(1)) { goto m1; } mask >>= 1;
if (BR86_IS_BCJ_BYTE(2)) { goto m2; } mask = 0;
if (BR86_IS_BCJ_BYTE(3)) { goto a3; }
}
prevPosT = bufferPos;
goto main_loop;
if (Test86MSByte(p[4]))
m0: p--;
m1: p--;
m2: p--;
if (mask == 0)
goto a3;
if (p > lim)
goto fin_p;
// if (((0x17u >> mask) & 1) == 0)
if (mask > 4 || mask == 3)
{
UInt32 src = ((UInt32)p[4] << 24) | ((UInt32)p[3] << 16) | ((UInt32)p[2] << 8) | ((UInt32)p[1]);
UInt32 dest;
for (;;)
{
Byte b;
int index;
if (encoding)
dest = (ip + (UInt32)bufferPos) + src;
else
dest = src - (ip + (UInt32)bufferPos);
if (prevMask == 0)
break;
index = kMaskToBitNumber[prevMask] * 8;
b = (Byte)(dest >> (24 - index));
if (!Test86MSByte(b))
break;
src = dest ^ ((1 << (32 - index)) - 1);
}
p[4] = (Byte)(~(((dest >> 24) & 1) - 1));
p[3] = (Byte)(dest >> 16);
p[2] = (Byte)(dest >> 8);
p[1] = (Byte)dest;
bufferPos += 5;
mask >>= 1;
continue; // goto cont;
}
else
mask >>= 1;
if (BR86_NEED_CONV_FOR_MS_BYTE(p[mask]))
continue; // goto cont;
// if (!BR86_NEED_CONV_FOR_MS_BYTE(p[3])) continue; // goto cont;
{
prevMask = ((prevMask << 1) & 0x7) | 1;
bufferPos++;
UInt32 v = GetUi32(p);
UInt32 c;
v += (1 << 24); if (v & 0xfe000000) continue; // goto cont;
c = BR_PC_GET;
BR_CONVERT_VAL(v, c)
{
mask <<= 3;
if (BR86_NEED_CONV_FOR_MS_BYTE(v >> mask))
{
v ^= (((UInt32)0x100 << mask) - 1);
#ifdef MY_CPU_X86
// for X86 : we can recalculate (c) to reduce register pressure
c = BR_PC_GET;
#endif
BR_CONVERT_VAL(v, c)
}
mask = 0;
}
// v = (v & ((1 << 24) - 1)) - (v & (1 << 24));
v &= (1 << 25) - 1; v -= (1 << 24);
SetUi32(p, v)
p += 4;
goto main_loop;
}
main_loop:
if (p >= lim)
goto fin;
for (;;)
{
BR86_PREPARE_BCJ_SCAN
p += 4;
if (BR86_IS_BCJ_BYTE(0)) { goto a0; }
if (BR86_IS_BCJ_BYTE(1)) { goto a1; }
if (BR86_IS_BCJ_BYTE(2)) { goto a2; }
if (BR86_IS_BCJ_BYTE(3)) { goto a3; }
if (p >= lim)
goto fin;
}
a0: p--;
a1: p--;
a2: p--;
a3:
if (p > lim)
goto fin_p;
// if (!BR86_NEED_CONV_FOR_MS_BYTE(p[3])) continue; // goto cont;
{
UInt32 v = GetUi32(p);
UInt32 c;
v += (1 << 24); if (v & 0xfe000000) continue; // goto cont;
c = BR_PC_GET;
BR_CONVERT_VAL(v, c)
// v = (v & ((1 << 24) - 1)) - (v & (1 << 24));
v &= (1 << 25) - 1; v -= (1 << 24);
SetUi32(p, v)
p += 4;
goto main_loop;
}
}
prevPosT = bufferPos - prevPosT;
*state = ((prevPosT > 3) ? 0 : ((prevMask << ((int)prevPosT - 1)) & 0x7));
return bufferPos;
fin_p:
p--;
fin:
// the following processing for tail is optional and can be commented
/*
lim += 4;
for (; p < lim; p++, mask >>= 1)
if ((*p & 0xfe) == 0xe8)
break;
*/
*state = (UInt32)mask;
return p;
}
}
#define Z7_BRANCH_CONV_ST_FUNC_IMP(name, m, encoding) \
Z7_NO_INLINE \
Z7_ATTRIB_NO_VECTOR \
Byte *m(name)(Byte *data, SizeT size, UInt32 pc, UInt32 *state) \
{ return Z7_BRANCH_CONV_ST(name)(data, size, pc, state, encoding); }
Z7_BRANCH_CONV_ST_FUNC_IMP(X86, Z7_BRANCH_CONV_ST_DEC, 0)
#ifndef Z7_EXTRACT_ONLY
Z7_BRANCH_CONV_ST_FUNC_IMP(X86, Z7_BRANCH_CONV_ST_ENC, 1)
#endif

View File

@@ -1,67 +1,14 @@
/* BraIA64.c -- Converter for IA-64 code
2008-10-04 : Igor Pavlov : Public domain */
2023-02-20 : Igor Pavlov : Public domain */
#include "Bra.h"
#include "Precomp.h"
static const Byte kBranchTable[32] =
{
0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0,
4, 4, 6, 6, 0, 0, 7, 7,
4, 4, 0, 0, 4, 4, 0, 0
};
// the code was moved to Bra.c
SizeT IA64_Convert(Byte *data, SizeT size, UInt32 ip, int encoding)
{
SizeT i;
if (size < 16)
return 0;
size -= 16;
for (i = 0; i <= size; i += 16)
{
UInt32 instrTemplate = data[i] & 0x1F;
UInt32 mask = kBranchTable[instrTemplate];
UInt32 bitPos = 5;
int slot;
for (slot = 0; slot < 3; slot++, bitPos += 41)
{
UInt32 bytePos, bitRes;
UInt64 instruction, instNorm;
int j;
if (((mask >> slot) & 1) == 0)
continue;
bytePos = (bitPos >> 3);
bitRes = bitPos & 0x7;
instruction = 0;
for (j = 0; j < 6; j++)
instruction += (UInt64)data[i + j + bytePos] << (8 * j);
#ifdef _MSC_VER
#pragma warning(disable : 4206) // nonstandard extension used : translation unit is empty
#endif
instNorm = instruction >> bitRes;
if (((instNorm >> 37) & 0xF) == 0x5 && ((instNorm >> 9) & 0x7) == 0)
{
UInt32 src = (UInt32)((instNorm >> 13) & 0xFFFFF);
UInt32 dest;
src |= ((UInt32)(instNorm >> 36) & 1) << 20;
src <<= 4;
if (encoding)
dest = ip + (UInt32)i + src;
else
dest = src - (ip + (UInt32)i);
dest >>= 4;
instNorm &= ~((UInt64)(0x8FFFFF) << 13);
instNorm |= ((UInt64)(dest & 0xFFFFF) << 13);
instNorm |= ((UInt64)(dest & 0x100000) << (36 - 20));
instruction &= (1 << bitRes) - 1;
instruction |= (instNorm << bitRes);
for (j = 0; j < 6; j++)
data[i + j + bytePos] = (Byte)(instruction >> (8 * j));
}
}
}
return i;
}
#if defined(__clang__)
#pragma GCC diagnostic ignored "-Wempty-translation-unit"
#endif

View File

@@ -1,15 +1,13 @@
/* BwtSort.c -- BWT block sorting
2008-08-17
Igor Pavlov
Public domain */
2023-04-02 : Igor Pavlov : Public domain */
#include "Precomp.h"
#include "BwtSort.h"
#include "Sort.h"
/* #define BLOCK_SORT_USE_HEAP_SORT */
#define NO_INLINE MY_FAST_CALL
/* Don't change it !!! */
#define kNumHashBytes 2
#define kNumHashValues (1 << (kNumHashBytes * 8))
@@ -60,7 +58,10 @@ SortGroup - is recursive Range-Sort function with HeapSort optimization for smal
returns: 1 - if there are groups, 0 - no more groups
*/
UInt32 NO_INLINE SortGroup(UInt32 BlockSize, UInt32 NumSortedBytes, UInt32 groupOffset, UInt32 groupSize, int NumRefBits, UInt32 *Indices
static
UInt32
Z7_FASTCALL
SortGroup(UInt32 BlockSize, UInt32 NumSortedBytes, UInt32 groupOffset, UInt32 groupSize, int NumRefBits, UInt32 *Indices
#ifndef BLOCK_SORT_USE_HEAP_SORT
, UInt32 left, UInt32 range
#endif
@@ -72,7 +73,7 @@ UInt32 NO_INLINE SortGroup(UInt32 BlockSize, UInt32 NumSortedBytes, UInt32 group
{
/*
#ifndef BLOCK_SORT_EXTERNAL_FLAGS
SetFinishedGroupSize(ind2, 1);
SetFinishedGroupSize(ind2, 1)
#endif
*/
return 0;
@@ -116,7 +117,7 @@ UInt32 NO_INLINE SortGroup(UInt32 BlockSize, UInt32 NumSortedBytes, UInt32 group
}
HeapSort(temp, groupSize);
mask = ((1 << NumRefBits) - 1);
mask = (((UInt32)1 << NumRefBits) - 1);
thereAreGroups = 0;
group = groupOffset;
@@ -314,7 +315,7 @@ UInt32 NO_INLINE SortGroup(UInt32 BlockSize, UInt32 NumSortedBytes, UInt32 group
#ifndef BLOCK_SORT_EXTERNAL_FLAGS
UInt32 subGroupSize = ((ind2[j] & ~0xC0000000) >> kNumBitsMax);
if ((ind2[j] & 0x40000000) != 0)
subGroupSize += ((ind2[j + 1] >> kNumBitsMax) << kNumExtra0Bits);
subGroupSize += ((ind2[(size_t)j + 1] >> kNumBitsMax) << kNumExtra0Bits);
subGroupSize++;
for (;;)
{
@@ -362,7 +363,7 @@ UInt32 BlockSort(UInt32 *Indices, const Byte *data, UInt32 blockSize)
for (i = 0; i < kNumHashValues; i++)
counters[i] = 0;
for (i = 0; i < blockSize - 1; i++)
counters[((UInt32)data[i] << 8) | data[i + 1]]++;
counters[((UInt32)data[i] << 8) | data[(size_t)i + 1]]++;
counters[((UInt32)data[i] << 8) | data[0]]++;
Groups = counters + BS_TEMP_SIZE;
@@ -392,11 +393,11 @@ UInt32 BlockSort(UInt32 *Indices, const Byte *data, UInt32 blockSize)
}
for (i = 0; i < blockSize - 1; i++)
Groups[i] = counters[((UInt32)data[i] << 8) | data[i + 1]];
Groups[i] = counters[((UInt32)data[i] << 8) | data[(size_t)i + 1]];
Groups[i] = counters[((UInt32)data[i] << 8) | data[0]];
for (i = 0; i < blockSize - 1; i++)
Indices[counters[((UInt32)data[i] << 8) | data[i + 1]]++] = i;
Indices[counters[((UInt32)data[i] << 8) | data[(size_t)i + 1]]++] = i;
Indices[counters[((UInt32)data[i] << 8) | data[0]]++] = i;
#ifndef BLOCK_SORT_EXTERNAL_FLAGS
@@ -448,11 +449,11 @@ UInt32 BlockSort(UInt32 *Indices, const Byte *data, UInt32 blockSize)
groupSize = ((Indices[i] & ~0xC0000000) >> kNumBitsMax);
{
Bool finishedGroup = ((Indices[i] & 0x80000000) == 0);
BoolInt finishedGroup = ((Indices[i] & 0x80000000) == 0);
if ((Indices[i] & 0x40000000) != 0)
{
groupSize += ((Indices[i + 1] >> kNumBitsMax) << kNumExtra0Bits);
Indices[i + 1] &= kIndexMask;
groupSize += ((Indices[(size_t)i + 1] >> kNumBitsMax) << kNumExtra0Bits);
Indices[(size_t)i + 1] &= kIndexMask;
}
Indices[i] &= kIndexMask;
groupSize++;
@@ -460,10 +461,10 @@ UInt32 BlockSort(UInt32 *Indices, const Byte *data, UInt32 blockSize)
{
Indices[i - finishedGroupSize] &= kIndexMask;
if (finishedGroupSize > 1)
Indices[i - finishedGroupSize + 1] &= kIndexMask;
Indices[(size_t)(i - finishedGroupSize) + 1] &= kIndexMask;
{
UInt32 newGroupSize = groupSize + finishedGroupSize;
SetFinishedGroupSize(Indices + i - finishedGroupSize, newGroupSize);
SetFinishedGroupSize(Indices + i - finishedGroupSize, newGroupSize)
finishedGroupSize = newGroupSize;
}
i += groupSize;
@@ -503,8 +504,8 @@ UInt32 BlockSort(UInt32 *Indices, const Byte *data, UInt32 blockSize)
UInt32 groupSize = ((Indices[i] & ~0xC0000000) >> kNumBitsMax);
if ((Indices[i] & 0x40000000) != 0)
{
groupSize += ((Indices[i + 1] >> kNumBitsMax) << kNumExtra0Bits);
Indices[i + 1] &= kIndexMask;
groupSize += ((Indices[(size_t)i + 1] >> kNumBitsMax) << kNumExtra0Bits);
Indices[(size_t)i + 1] &= kIndexMask;
}
Indices[i] &= kIndexMask;
groupSize++;
@@ -513,4 +514,3 @@ UInt32 BlockSort(UInt32 *Indices, const Byte *data, UInt32 blockSize)
#endif
return Groups[0];
}

View File

@@ -1,12 +1,12 @@
/* BwtSort.h -- BWT block sorting
2008-03-26
Igor Pavlov
Public domain */
2023-03-03 : Igor Pavlov : Public domain */
#ifndef __BWTSORT_H
#define __BWTSORT_H
#ifndef ZIP7_INC_BWT_SORT_H
#define ZIP7_INC_BWT_SORT_H
#include "Types.h"
#include "7zTypes.h"
EXTERN_C_BEGIN
/* use BLOCK_SORT_EXTERNAL_FLAGS if blockSize can be > 1M */
/* #define BLOCK_SORT_EXTERNAL_FLAGS */
@@ -21,4 +21,6 @@ Public domain */
UInt32 BlockSort(UInt32 *indices, const Byte *data, UInt32 blockSize);
EXTERN_C_END
#endif

161
C/Compiler.h Executable file
View File

@@ -0,0 +1,161 @@
/* Compiler.h : Compiler specific defines and pragmas
2023-04-02 : Igor Pavlov : Public domain */
#ifndef ZIP7_INC_COMPILER_H
#define ZIP7_INC_COMPILER_H
#if defined(__clang__)
# define Z7_CLANG_VERSION (__clang_major__ * 10000 + __clang_minor__ * 100 + __clang_patchlevel__)
#endif
#if defined(__clang__) && defined(__apple_build_version__)
# define Z7_APPLE_CLANG_VERSION Z7_CLANG_VERSION
#elif defined(__clang__)
# define Z7_LLVM_CLANG_VERSION Z7_CLANG_VERSION
#elif defined(__GNUC__)
# define Z7_GCC_VERSION (__GNUC__ * 10000 + __GNUC_MINOR__ * 100 + __GNUC_PATCHLEVEL__)
#endif
#ifdef _MSC_VER
#if !defined(__clang__) && !defined(__GNUC__)
#define Z7_MSC_VER_ORIGINAL _MSC_VER
#endif
#endif
#if defined(__MINGW32__) || defined(__MINGW64__)
#define Z7_MINGW
#endif
// #pragma GCC diagnostic ignored "-Wunknown-pragmas"
#ifdef __clang__
// padding size of '' with 4 bytes to alignment boundary
#pragma GCC diagnostic ignored "-Wpadded"
#endif
#ifdef _MSC_VER
#ifdef UNDER_CE
#define RPC_NO_WINDOWS_H
/* #pragma warning(disable : 4115) // '_RPC_ASYNC_STATE' : named type definition in parentheses */
#pragma warning(disable : 4201) // nonstandard extension used : nameless struct/union
#pragma warning(disable : 4214) // nonstandard extension used : bit field types other than int
#endif
#if defined(_MSC_VER) && _MSC_VER >= 1800
#pragma warning(disable : 4464) // relative include path contains '..'
#endif
// == 1200 : -O1 : for __forceinline
// >= 1900 : -O1 : for printf
#pragma warning(disable : 4710) // function not inlined
#if _MSC_VER < 1900
// winnt.h: 'Int64ShllMod32'
#pragma warning(disable : 4514) // unreferenced inline function has been removed
#endif
#if _MSC_VER < 1300
// #pragma warning(disable : 4702) // unreachable code
// Bra.c : -O1:
#pragma warning(disable : 4714) // function marked as __forceinline not inlined
#endif
/*
#if _MSC_VER > 1400 && _MSC_VER <= 1900
// strcat: This function or variable may be unsafe
// sysinfoapi.h: kit10: GetVersion was declared deprecated
#pragma warning(disable : 4996)
#endif
*/
#pragma warning(disable : 4255)
#if _MSC_VER > 1200
// -Wall warnings
#pragma warning(disable : 4711) // function selected for automatic inline expansion
#pragma warning(disable : 4820) // '2' bytes padding added after data member
#if _MSC_VER >= 1400 && _MSC_VER < 1920
// 1400: string.h: _DBG_MEMCPY_INLINE_
// 1600 - 191x : smmintrin.h __cplusplus'
// is not defined as a preprocessor macro, replacing with '0' for '#if/#elif'
#pragma warning(disable : 4668)
// 1400 - 1600 : WinDef.h : 'FARPROC' :
// 1900 - 191x : immintrin.h: _readfsbase_u32
// no function prototype given : converting '()' to '(void)'
#pragma warning(disable : 4255)
#endif
#if _MSC_VER >= 1914
// Compiler will insert Spectre mitigation for memory load if /Qspectre switch specified
#pragma warning(disable : 5045)
#endif
#endif // _MSC_VER > 1200
#endif // _MSC_VER
#if defined(__clang__) && (__clang_major__ >= 4)
#define Z7_PRAGMA_OPT_DISABLE_LOOP_UNROLL_VECTORIZE \
_Pragma("clang loop unroll(disable)") \
_Pragma("clang loop vectorize(disable)")
#define Z7_ATTRIB_NO_VECTORIZE
#elif defined(__GNUC__) && (__GNUC__ >= 5)
#define Z7_ATTRIB_NO_VECTORIZE __attribute__((optimize("no-tree-vectorize")))
// __attribute__((optimize("no-unroll-loops")));
#define Z7_PRAGMA_OPT_DISABLE_LOOP_UNROLL_VECTORIZE
#elif defined(_MSC_VER) && (_MSC_VER >= 1920)
#define Z7_PRAGMA_OPT_DISABLE_LOOP_UNROLL_VECTORIZE \
_Pragma("loop( no_vector )")
#define Z7_ATTRIB_NO_VECTORIZE
#else
#define Z7_PRAGMA_OPT_DISABLE_LOOP_UNROLL_VECTORIZE
#define Z7_ATTRIB_NO_VECTORIZE
#endif
#if defined(MY_CPU_X86_OR_AMD64) && ( \
defined(__clang__) && (__clang_major__ >= 4) \
|| defined(__GNUC__) && (__GNUC__ >= 5))
#define Z7_ATTRIB_NO_SSE __attribute__((__target__("no-sse")))
#else
#define Z7_ATTRIB_NO_SSE
#endif
#define Z7_ATTRIB_NO_VECTOR \
Z7_ATTRIB_NO_VECTORIZE \
Z7_ATTRIB_NO_SSE
#if defined(__clang__) && (__clang_major__ >= 8) \
|| defined(__GNUC__) && (__GNUC__ >= 1000) \
/* || defined(_MSC_VER) && (_MSC_VER >= 1920) */
// GCC is not good for __builtin_expect()
#define Z7_LIKELY(x) (__builtin_expect((x), 1))
#define Z7_UNLIKELY(x) (__builtin_expect((x), 0))
// #define Z7_unlikely [[unlikely]]
// #define Z7_likely [[likely]]
#else
#define Z7_LIKELY(x) (x)
#define Z7_UNLIKELY(x) (x)
// #define Z7_likely
#endif
#if (defined(Z7_CLANG_VERSION) && (Z7_CLANG_VERSION >= 36000))
#define Z7_DIAGNOSCTIC_IGNORE_BEGIN_RESERVED_MACRO_IDENTIFIER \
_Pragma("GCC diagnostic push") \
_Pragma("GCC diagnostic ignored \"-Wreserved-macro-identifier\"")
#define Z7_DIAGNOSCTIC_IGNORE_END_RESERVED_MACRO_IDENTIFIER \
_Pragma("GCC diagnostic pop")
#else
#define Z7_DIAGNOSCTIC_IGNORE_BEGIN_RESERVED_MACRO_IDENTIFIER
#define Z7_DIAGNOSCTIC_IGNORE_END_RESERVED_MACRO_IDENTIFIER
#endif
#define UNUSED_VAR(x) (void)x;
/* #define UNUSED_VAR(x) x=x; */
#endif

823
C/CpuArch.c Executable file
View File

@@ -0,0 +1,823 @@
/* CpuArch.c -- CPU specific code
2023-05-18 : Igor Pavlov : Public domain */
#include "Precomp.h"
// #include <stdio.h>
#include "CpuArch.h"
#ifdef MY_CPU_X86_OR_AMD64
#undef NEED_CHECK_FOR_CPUID
#if !defined(MY_CPU_AMD64)
#define NEED_CHECK_FOR_CPUID
#endif
/*
cpuid instruction supports (subFunction) parameter in ECX,
that is used only with some specific (function) parameter values.
But we always use only (subFunction==0).
*/
/*
__cpuid(): MSVC and GCC/CLANG use same function/macro name
but parameters are different.
We use MSVC __cpuid() parameters style for our z7_x86_cpuid() function.
*/
#if defined(__GNUC__) /* && (__GNUC__ >= 10) */ \
|| defined(__clang__) /* && (__clang_major__ >= 10) */
/* there was some CLANG/GCC compilers that have issues with
rbx(ebx) handling in asm blocks in -fPIC mode (__PIC__ is defined).
compiler's <cpuid.h> contains the macro __cpuid() that is similar to our code.
The history of __cpuid() changes in CLANG/GCC:
GCC:
2007: it preserved ebx for (__PIC__ && __i386__)
2013: it preserved rbx and ebx for __PIC__
2014: it doesn't preserves rbx and ebx anymore
we suppose that (__GNUC__ >= 5) fixed that __PIC__ ebx/rbx problem.
CLANG:
2014+: it preserves rbx, but only for 64-bit code. No __PIC__ check.
Why CLANG cares about 64-bit mode only, and doesn't care about ebx (in 32-bit)?
Do we need __PIC__ test for CLANG or we must care about rbx even if
__PIC__ is not defined?
*/
#define ASM_LN "\n"
#if defined(MY_CPU_AMD64) && defined(__PIC__) \
&& ((defined (__GNUC__) && (__GNUC__ < 5)) || defined(__clang__))
#define x86_cpuid_MACRO(p, func) { \
__asm__ __volatile__ ( \
ASM_LN "mov %%rbx, %q1" \
ASM_LN "cpuid" \
ASM_LN "xchg %%rbx, %q1" \
: "=a" ((p)[0]), "=&r" ((p)[1]), "=c" ((p)[2]), "=d" ((p)[3]) : "0" (func), "2"(0)); }
/* "=&r" selects free register. It can select even rbx, if that register is free.
"=&D" for (RDI) also works, but the code can be larger with "=&D"
"2"(0) means (subFunction = 0),
2 is (zero-based) index in the output constraint list "=c" (ECX). */
#elif defined(MY_CPU_X86) && defined(__PIC__) \
&& ((defined (__GNUC__) && (__GNUC__ < 5)) || defined(__clang__))
#define x86_cpuid_MACRO(p, func) { \
__asm__ __volatile__ ( \
ASM_LN "mov %%ebx, %k1" \
ASM_LN "cpuid" \
ASM_LN "xchg %%ebx, %k1" \
: "=a" ((p)[0]), "=&r" ((p)[1]), "=c" ((p)[2]), "=d" ((p)[3]) : "0" (func), "2"(0)); }
#else
#define x86_cpuid_MACRO(p, func) { \
__asm__ __volatile__ ( \
ASM_LN "cpuid" \
: "=a" ((p)[0]), "=b" ((p)[1]), "=c" ((p)[2]), "=d" ((p)[3]) : "0" (func), "2"(0)); }
#endif
void Z7_FASTCALL z7_x86_cpuid(UInt32 p[4], UInt32 func)
{
x86_cpuid_MACRO(p, func)
}
Z7_NO_INLINE
UInt32 Z7_FASTCALL z7_x86_cpuid_GetMaxFunc(void)
{
#if defined(NEED_CHECK_FOR_CPUID)
#define EFALGS_CPUID_BIT 21
UInt32 a;
__asm__ __volatile__ (
ASM_LN "pushf"
ASM_LN "pushf"
ASM_LN "pop %0"
// ASM_LN "movl %0, %1"
// ASM_LN "xorl $0x200000, %0"
ASM_LN "btc %1, %0"
ASM_LN "push %0"
ASM_LN "popf"
ASM_LN "pushf"
ASM_LN "pop %0"
ASM_LN "xorl (%%esp), %0"
ASM_LN "popf"
ASM_LN
: "=&r" (a) // "=a"
: "i" (EFALGS_CPUID_BIT)
);
if ((a & (1 << EFALGS_CPUID_BIT)) == 0)
return 0;
#endif
{
UInt32 p[4];
x86_cpuid_MACRO(p, 0)
return p[0];
}
}
#undef ASM_LN
#elif !defined(_MSC_VER)
/*
// for gcc/clang and other: we can try to use __cpuid macro:
#include <cpuid.h>
void Z7_FASTCALL z7_x86_cpuid(UInt32 p[4], UInt32 func)
{
__cpuid(func, p[0], p[1], p[2], p[3]);
}
UInt32 Z7_FASTCALL z7_x86_cpuid_GetMaxFunc(void)
{
return (UInt32)__get_cpuid_max(0, NULL);
}
*/
// for unsupported cpuid:
void Z7_FASTCALL z7_x86_cpuid(UInt32 p[4], UInt32 func)
{
UNUSED_VAR(func)
p[0] = p[1] = p[2] = p[3] = 0;
}
UInt32 Z7_FASTCALL z7_x86_cpuid_GetMaxFunc(void)
{
return 0;
}
#else // _MSC_VER
#if !defined(MY_CPU_AMD64)
UInt32 __declspec(naked) Z7_FASTCALL z7_x86_cpuid_GetMaxFunc(void)
{
#if defined(NEED_CHECK_FOR_CPUID)
#define EFALGS_CPUID_BIT 21
__asm pushfd
__asm pushfd
/*
__asm pop eax
// __asm mov edx, eax
__asm btc eax, EFALGS_CPUID_BIT
__asm push eax
*/
__asm btc dword ptr [esp], EFALGS_CPUID_BIT
__asm popfd
__asm pushfd
__asm pop eax
// __asm xor eax, edx
__asm xor eax, [esp]
// __asm push edx
__asm popfd
__asm and eax, (1 shl EFALGS_CPUID_BIT)
__asm jz end_func
#endif
__asm push ebx
__asm xor eax, eax // func
__asm xor ecx, ecx // subFunction (optional) for (func == 0)
__asm cpuid
__asm pop ebx
#if defined(NEED_CHECK_FOR_CPUID)
end_func:
#endif
__asm ret 0
}
void __declspec(naked) Z7_FASTCALL z7_x86_cpuid(UInt32 p[4], UInt32 func)
{
UNUSED_VAR(p)
UNUSED_VAR(func)
__asm push ebx
__asm push edi
__asm mov edi, ecx // p
__asm mov eax, edx // func
__asm xor ecx, ecx // subfunction (optional) for (func == 0)
__asm cpuid
__asm mov [edi ], eax
__asm mov [edi + 4], ebx
__asm mov [edi + 8], ecx
__asm mov [edi + 12], edx
__asm pop edi
__asm pop ebx
__asm ret 0
}
#else // MY_CPU_AMD64
#if _MSC_VER >= 1600
#include <intrin.h>
#define MY_cpuidex __cpuidex
#else
/*
__cpuid (func == (0 or 7)) requires subfunction number in ECX.
MSDN: The __cpuid intrinsic clears the ECX register before calling the cpuid instruction.
__cpuid() in new MSVC clears ECX.
__cpuid() in old MSVC (14.00) x64 doesn't clear ECX
We still can use __cpuid for low (func) values that don't require ECX,
but __cpuid() in old MSVC will be incorrect for some func values: (func == 7).
So here we use the hack for old MSVC to send (subFunction) in ECX register to cpuid instruction,
where ECX value is first parameter for FASTCALL / NO_INLINE func,
So the caller of MY_cpuidex_HACK() sets ECX as subFunction, and
old MSVC for __cpuid() doesn't change ECX and cpuid instruction gets (subFunction) value.
DON'T remove Z7_NO_INLINE and Z7_FASTCALL for MY_cpuidex_HACK(): !!!
*/
static
Z7_NO_INLINE void Z7_FASTCALL MY_cpuidex_HACK(UInt32 subFunction, UInt32 func, int *CPUInfo)
{
UNUSED_VAR(subFunction)
__cpuid(CPUInfo, func);
}
#define MY_cpuidex(info, func, func2) MY_cpuidex_HACK(func2, func, info)
#pragma message("======== MY_cpuidex_HACK WAS USED ========")
#endif // _MSC_VER >= 1600
#if !defined(MY_CPU_AMD64)
/* inlining for __cpuid() in MSVC x86 (32-bit) produces big ineffective code,
so we disable inlining here */
Z7_NO_INLINE
#endif
void Z7_FASTCALL z7_x86_cpuid(UInt32 p[4], UInt32 func)
{
MY_cpuidex((int *)p, (int)func, 0);
}
Z7_NO_INLINE
UInt32 Z7_FASTCALL z7_x86_cpuid_GetMaxFunc(void)
{
int a[4];
MY_cpuidex(a, 0, 0);
return a[0];
}
#endif // MY_CPU_AMD64
#endif // _MSC_VER
#if defined(NEED_CHECK_FOR_CPUID)
#define CHECK_CPUID_IS_SUPPORTED { if (z7_x86_cpuid_GetMaxFunc() == 0) return 0; }
#else
#define CHECK_CPUID_IS_SUPPORTED
#endif
#undef NEED_CHECK_FOR_CPUID
static
BoolInt x86cpuid_Func_1(UInt32 *p)
{
CHECK_CPUID_IS_SUPPORTED
z7_x86_cpuid(p, 1);
return True;
}
/*
static const UInt32 kVendors[][1] =
{
{ 0x756E6547 }, // , 0x49656E69, 0x6C65746E },
{ 0x68747541 }, // , 0x69746E65, 0x444D4163 },
{ 0x746E6543 } // , 0x48727561, 0x736C7561 }
};
*/
/*
typedef struct
{
UInt32 maxFunc;
UInt32 vendor[3];
UInt32 ver;
UInt32 b;
UInt32 c;
UInt32 d;
} Cx86cpuid;
enum
{
CPU_FIRM_INTEL,
CPU_FIRM_AMD,
CPU_FIRM_VIA
};
int x86cpuid_GetFirm(const Cx86cpuid *p);
#define x86cpuid_ver_GetFamily(ver) (((ver >> 16) & 0xff0) | ((ver >> 8) & 0xf))
#define x86cpuid_ver_GetModel(ver) (((ver >> 12) & 0xf0) | ((ver >> 4) & 0xf))
#define x86cpuid_ver_GetStepping(ver) (ver & 0xf)
int x86cpuid_GetFirm(const Cx86cpuid *p)
{
unsigned i;
for (i = 0; i < sizeof(kVendors) / sizeof(kVendors[0]); i++)
{
const UInt32 *v = kVendors[i];
if (v[0] == p->vendor[0]
// && v[1] == p->vendor[1]
// && v[2] == p->vendor[2]
)
return (int)i;
}
return -1;
}
BoolInt CPU_Is_InOrder()
{
Cx86cpuid p;
UInt32 family, model;
if (!x86cpuid_CheckAndRead(&p))
return True;
family = x86cpuid_ver_GetFamily(p.ver);
model = x86cpuid_ver_GetModel(p.ver);
switch (x86cpuid_GetFirm(&p))
{
case CPU_FIRM_INTEL: return (family < 6 || (family == 6 && (
// In-Order Atom CPU
model == 0x1C // 45 nm, N4xx, D4xx, N5xx, D5xx, 230, 330
|| model == 0x26 // 45 nm, Z6xx
|| model == 0x27 // 32 nm, Z2460
|| model == 0x35 // 32 nm, Z2760
|| model == 0x36 // 32 nm, N2xxx, D2xxx
)));
case CPU_FIRM_AMD: return (family < 5 || (family == 5 && (model < 6 || model == 0xA)));
case CPU_FIRM_VIA: return (family < 6 || (family == 6 && model < 0xF));
}
return False; // v23 : unknown processors are not In-Order
}
*/
#ifdef _WIN32
#include "7zWindows.h"
#endif
#if !defined(MY_CPU_AMD64) && defined(_WIN32)
/* for legacy SSE ia32: there is no user-space cpu instruction to check
that OS supports SSE register storing/restoring on context switches.
So we need some OS-specific function to check that it's safe to use SSE registers.
*/
Z7_FORCE_INLINE
static BoolInt CPU_Sys_Is_SSE_Supported(void)
{
#ifdef _MSC_VER
#pragma warning(push)
#pragma warning(disable : 4996) // `GetVersion': was declared deprecated
#endif
/* low byte is major version of Windows
We suppose that any Windows version since
Windows2000 (major == 5) supports SSE registers */
return (Byte)GetVersion() >= 5;
#if defined(_MSC_VER)
#pragma warning(pop)
#endif
}
#define CHECK_SYS_SSE_SUPPORT if (!CPU_Sys_Is_SSE_Supported()) return False;
#else
#define CHECK_SYS_SSE_SUPPORT
#endif
#if !defined(MY_CPU_AMD64)
BoolInt CPU_IsSupported_CMOV(void)
{
UInt32 a[4];
if (!x86cpuid_Func_1(&a[0]))
return 0;
return (a[3] >> 15) & 1;
}
BoolInt CPU_IsSupported_SSE(void)
{
UInt32 a[4];
CHECK_SYS_SSE_SUPPORT
if (!x86cpuid_Func_1(&a[0]))
return 0;
return (a[3] >> 25) & 1;
}
BoolInt CPU_IsSupported_SSE2(void)
{
UInt32 a[4];
CHECK_SYS_SSE_SUPPORT
if (!x86cpuid_Func_1(&a[0]))
return 0;
return (a[3] >> 26) & 1;
}
#endif
static UInt32 x86cpuid_Func_1_ECX(void)
{
UInt32 a[4];
CHECK_SYS_SSE_SUPPORT
if (!x86cpuid_Func_1(&a[0]))
return 0;
return a[2];
}
BoolInt CPU_IsSupported_AES(void)
{
return (x86cpuid_Func_1_ECX() >> 25) & 1;
}
BoolInt CPU_IsSupported_SSSE3(void)
{
return (x86cpuid_Func_1_ECX() >> 9) & 1;
}
BoolInt CPU_IsSupported_SSE41(void)
{
return (x86cpuid_Func_1_ECX() >> 19) & 1;
}
BoolInt CPU_IsSupported_SHA(void)
{
CHECK_SYS_SSE_SUPPORT
if (z7_x86_cpuid_GetMaxFunc() < 7)
return False;
{
UInt32 d[4];
z7_x86_cpuid(d, 7);
return (d[1] >> 29) & 1;
}
}
/*
MSVC: _xgetbv() intrinsic is available since VS2010SP1.
MSVC also defines (_XCR_XFEATURE_ENABLED_MASK) macro in
<immintrin.h> that we can use or check.
For any 32-bit x86 we can use asm code in MSVC,
but MSVC asm code is huge after compilation.
So _xgetbv() is better
ICC: _xgetbv() intrinsic is available (in what version of ICC?)
ICC defines (__GNUC___) and it supports gnu assembler
also ICC supports MASM style code with -use-msasm switch.
but ICC doesn't support __attribute__((__target__))
GCC/CLANG 9:
_xgetbv() is macro that works via __builtin_ia32_xgetbv()
and we need __attribute__((__target__("xsave")).
But with __target__("xsave") the function will be not
inlined to function that has no __target__("xsave") attribute.
If we want _xgetbv() call inlining, then we should use asm version
instead of calling _xgetbv().
Note:intrinsic is broke before GCC 8.2:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85684
*/
#if defined(__INTEL_COMPILER) && (__INTEL_COMPILER >= 1100) \
|| defined(_MSC_VER) && (_MSC_VER >= 1600) && (_MSC_FULL_VER >= 160040219) \
|| defined(__GNUC__) && (__GNUC__ >= 9) \
|| defined(__clang__) && (__clang_major__ >= 9)
// we define ATTRIB_XGETBV, if we want to use predefined _xgetbv() from compiler
#if defined(__INTEL_COMPILER)
#define ATTRIB_XGETBV
#elif defined(__GNUC__) || defined(__clang__)
// we don't define ATTRIB_XGETBV here, because asm version is better for inlining.
// #define ATTRIB_XGETBV __attribute__((__target__("xsave")))
#else
#define ATTRIB_XGETBV
#endif
#endif
#if defined(ATTRIB_XGETBV)
#include <immintrin.h>
#endif
// XFEATURE_ENABLED_MASK/XCR0
#define MY_XCR_XFEATURE_ENABLED_MASK 0
#if defined(ATTRIB_XGETBV)
ATTRIB_XGETBV
#endif
static UInt64 x86_xgetbv_0(UInt32 num)
{
#if defined(ATTRIB_XGETBV)
{
return
#if (defined(_MSC_VER))
_xgetbv(num);
#else
__builtin_ia32_xgetbv(
#if !defined(__clang__)
(int)
#endif
num);
#endif
}
#elif defined(__GNUC__) || defined(__clang__) || defined(__SUNPRO_CC)
UInt32 a, d;
#if defined(__GNUC__) && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 4))
__asm__
(
"xgetbv"
: "=a"(a), "=d"(d) : "c"(num) : "cc"
);
#else // is old gcc
__asm__
(
".byte 0x0f, 0x01, 0xd0" "\n\t"
: "=a"(a), "=d"(d) : "c"(num) : "cc"
);
#endif
return ((UInt64)d << 32) | a;
// return a;
#elif defined(_MSC_VER) && !defined(MY_CPU_AMD64)
UInt32 a, d;
__asm {
push eax
push edx
push ecx
mov ecx, num;
// xor ecx, ecx // = MY_XCR_XFEATURE_ENABLED_MASK
_emit 0x0f
_emit 0x01
_emit 0xd0
mov a, eax
mov d, edx
pop ecx
pop edx
pop eax
}
return ((UInt64)d << 32) | a;
// return a;
#else // it's unknown compiler
// #error "Need xgetbv function"
UNUSED_VAR(num)
// for MSVC-X64 we could call external function from external file.
/* Actually we had checked OSXSAVE/AVX in cpuid before.
So it's expected that OS supports at least AVX and below. */
// if (num != MY_XCR_XFEATURE_ENABLED_MASK) return 0; // if not XCR0
return
// (1 << 0) | // x87
(1 << 1) // SSE
| (1 << 2); // AVX
#endif
}
#ifdef _WIN32
/*
Windows versions do not know about new ISA extensions that
can be introduced. But we still can use new extensions,
even if Windows doesn't report about supporting them,
But we can use new extensions, only if Windows knows about new ISA extension
that changes the number or size of registers: SSE, AVX/XSAVE, AVX512
So it's enough to check
MY_PF_AVX_INSTRUCTIONS_AVAILABLE
instead of
MY_PF_AVX2_INSTRUCTIONS_AVAILABLE
*/
#define MY_PF_XSAVE_ENABLED 17
// #define MY_PF_SSSE3_INSTRUCTIONS_AVAILABLE 36
// #define MY_PF_SSE4_1_INSTRUCTIONS_AVAILABLE 37
// #define MY_PF_SSE4_2_INSTRUCTIONS_AVAILABLE 38
// #define MY_PF_AVX_INSTRUCTIONS_AVAILABLE 39
// #define MY_PF_AVX2_INSTRUCTIONS_AVAILABLE 40
// #define MY_PF_AVX512F_INSTRUCTIONS_AVAILABLE 41
#endif
BoolInt CPU_IsSupported_AVX(void)
{
#ifdef _WIN32
if (!IsProcessorFeaturePresent(MY_PF_XSAVE_ENABLED))
return False;
/* PF_AVX_INSTRUCTIONS_AVAILABLE probably is supported starting from
some latest Win10 revisions. But we need AVX in older Windows also.
So we don't use the following check: */
/*
if (!IsProcessorFeaturePresent(MY_PF_AVX_INSTRUCTIONS_AVAILABLE))
return False;
*/
#endif
/*
OS must use new special XSAVE/XRSTOR instructions to save
AVX registers when it required for context switching.
At OS statring:
OS sets CR4.OSXSAVE flag to signal the processor that OS supports the XSAVE extensions.
Also OS sets bitmask in XCR0 register that defines what
registers will be processed by XSAVE instruction:
XCR0.SSE[bit 0] - x87 registers and state
XCR0.SSE[bit 1] - SSE registers and state
XCR0.AVX[bit 2] - AVX registers and state
CR4.OSXSAVE is reflected to CPUID.1:ECX.OSXSAVE[bit 27].
So we can read that bit in user-space.
XCR0 is available for reading in user-space by new XGETBV instruction.
*/
{
const UInt32 c = x86cpuid_Func_1_ECX();
if (0 == (1
& (c >> 28) // AVX instructions are supported by hardware
& (c >> 27))) // OSXSAVE bit: XSAVE and related instructions are enabled by OS.
return False;
}
/* also we can check
CPUID.1:ECX.XSAVE [bit 26] : that shows that
XSAVE, XRESTOR, XSETBV, XGETBV instructions are supported by hardware.
But that check is redundant, because if OSXSAVE bit is set, then XSAVE is also set */
/* If OS have enabled XSAVE extension instructions (OSXSAVE == 1),
in most cases we expect that OS also will support storing/restoring
for AVX and SSE states at least.
But to be ensure for that we call user-space instruction
XGETBV(0) to get XCR0 value that contains bitmask that defines
what exact states(registers) OS have enabled for storing/restoring.
*/
{
const UInt32 bm = (UInt32)x86_xgetbv_0(MY_XCR_XFEATURE_ENABLED_MASK);
// printf("\n=== XGetBV=%d\n", bm);
return 1
& (bm >> 1) // SSE state is supported (set by OS) for storing/restoring
& (bm >> 2); // AVX state is supported (set by OS) for storing/restoring
}
// since Win7SP1: we can use GetEnabledXStateFeatures();
}
BoolInt CPU_IsSupported_AVX2(void)
{
if (!CPU_IsSupported_AVX())
return False;
if (z7_x86_cpuid_GetMaxFunc() < 7)
return False;
{
UInt32 d[4];
z7_x86_cpuid(d, 7);
// printf("\ncpuid(7): ebx=%8x ecx=%8x\n", d[1], d[2]);
return 1
& (d[1] >> 5); // avx2
}
}
BoolInt CPU_IsSupported_VAES_AVX2(void)
{
if (!CPU_IsSupported_AVX())
return False;
if (z7_x86_cpuid_GetMaxFunc() < 7)
return False;
{
UInt32 d[4];
z7_x86_cpuid(d, 7);
// printf("\ncpuid(7): ebx=%8x ecx=%8x\n", d[1], d[2]);
return 1
& (d[1] >> 5) // avx2
// & (d[1] >> 31) // avx512vl
& (d[2] >> 9); // vaes // VEX-256/EVEX
}
}
BoolInt CPU_IsSupported_PageGB(void)
{
CHECK_CPUID_IS_SUPPORTED
{
UInt32 d[4];
z7_x86_cpuid(d, 0x80000000);
if (d[0] < 0x80000001)
return False;
z7_x86_cpuid(d, 0x80000001);
return (d[3] >> 26) & 1;
}
}
#elif defined(MY_CPU_ARM_OR_ARM64)
#ifdef _WIN32
#include "7zWindows.h"
BoolInt CPU_IsSupported_CRC32(void) { return IsProcessorFeaturePresent(PF_ARM_V8_CRC32_INSTRUCTIONS_AVAILABLE) ? 1 : 0; }
BoolInt CPU_IsSupported_CRYPTO(void) { return IsProcessorFeaturePresent(PF_ARM_V8_CRYPTO_INSTRUCTIONS_AVAILABLE) ? 1 : 0; }
BoolInt CPU_IsSupported_NEON(void) { return IsProcessorFeaturePresent(PF_ARM_NEON_INSTRUCTIONS_AVAILABLE) ? 1 : 0; }
#else
#if defined(__APPLE__)
/*
#include <stdio.h>
#include <string.h>
static void Print_sysctlbyname(const char *name)
{
size_t bufSize = 256;
char buf[256];
int res = sysctlbyname(name, &buf, &bufSize, NULL, 0);
{
int i;
printf("\nres = %d : %s : '%s' : bufSize = %d, numeric", res, name, buf, (unsigned)bufSize);
for (i = 0; i < 20; i++)
printf(" %2x", (unsigned)(Byte)buf[i]);
}
}
*/
/*
Print_sysctlbyname("hw.pagesize");
Print_sysctlbyname("machdep.cpu.brand_string");
*/
static BoolInt z7_sysctlbyname_Get_BoolInt(const char *name)
{
UInt32 val = 0;
if (z7_sysctlbyname_Get_UInt32(name, &val) == 0 && val == 1)
return 1;
return 0;
}
BoolInt CPU_IsSupported_CRC32(void)
{
return z7_sysctlbyname_Get_BoolInt("hw.optional.armv8_crc32");
}
BoolInt CPU_IsSupported_NEON(void)
{
return z7_sysctlbyname_Get_BoolInt("hw.optional.neon");
}
#ifdef MY_CPU_ARM64
#define APPLE_CRYPTO_SUPPORT_VAL 1
#else
#define APPLE_CRYPTO_SUPPORT_VAL 0
#endif
BoolInt CPU_IsSupported_SHA1(void) { return APPLE_CRYPTO_SUPPORT_VAL; }
BoolInt CPU_IsSupported_SHA2(void) { return APPLE_CRYPTO_SUPPORT_VAL; }
BoolInt CPU_IsSupported_AES (void) { return APPLE_CRYPTO_SUPPORT_VAL; }
#else // __APPLE__
#include <sys/auxv.h>
#define USE_HWCAP
#ifdef USE_HWCAP
#include <asm/hwcap.h>
#define MY_HWCAP_CHECK_FUNC_2(name1, name2) \
BoolInt CPU_IsSupported_ ## name1() { return (getauxval(AT_HWCAP) & (HWCAP_ ## name2)) ? 1 : 0; }
#ifdef MY_CPU_ARM64
#define MY_HWCAP_CHECK_FUNC(name) \
MY_HWCAP_CHECK_FUNC_2(name, name)
MY_HWCAP_CHECK_FUNC_2(NEON, ASIMD)
// MY_HWCAP_CHECK_FUNC (ASIMD)
#elif defined(MY_CPU_ARM)
#define MY_HWCAP_CHECK_FUNC(name) \
BoolInt CPU_IsSupported_ ## name() { return (getauxval(AT_HWCAP2) & (HWCAP2_ ## name)) ? 1 : 0; }
MY_HWCAP_CHECK_FUNC_2(NEON, NEON)
#endif
#else // USE_HWCAP
#define MY_HWCAP_CHECK_FUNC(name) \
BoolInt CPU_IsSupported_ ## name() { return 0; }
MY_HWCAP_CHECK_FUNC(NEON)
#endif // USE_HWCAP
MY_HWCAP_CHECK_FUNC (CRC32)
MY_HWCAP_CHECK_FUNC (SHA1)
MY_HWCAP_CHECK_FUNC (SHA2)
MY_HWCAP_CHECK_FUNC (AES)
#endif // __APPLE__
#endif // _WIN32
#endif // MY_CPU_ARM_OR_ARM64
#ifdef __APPLE__
#include <sys/sysctl.h>
int z7_sysctlbyname_Get(const char *name, void *buf, size_t *bufSize)
{
return sysctlbyname(name, buf, bufSize, NULL, 0);
}
int z7_sysctlbyname_Get_UInt32(const char *name, UInt32 *val)
{
size_t bufSize = sizeof(*val);
const int res = z7_sysctlbyname_Get(name, val, &bufSize);
if (res == 0 && bufSize != sizeof(*val))
return EFAULT;
return res;
}
#endif

View File

@@ -1,33 +1,362 @@
/* CpuArch.h
2008-08-05
Igor Pavlov
Public domain */
/* CpuArch.h -- CPU specific code
2023-04-02 : Igor Pavlov : Public domain */
#ifndef __CPUARCH_H
#define __CPUARCH_H
#ifndef ZIP7_INC_CPU_ARCH_H
#define ZIP7_INC_CPU_ARCH_H
#include "7zTypes.h"
EXTERN_C_BEGIN
/*
LITTLE_ENDIAN_UNALIGN means:
1) CPU is LITTLE_ENDIAN
2) it's allowed to make unaligned memory accesses
if LITTLE_ENDIAN_UNALIGN is not defined, it means that we don't know
about these properties of platform.
MY_CPU_LE means that CPU is LITTLE ENDIAN.
MY_CPU_BE means that CPU is BIG ENDIAN.
If MY_CPU_LE and MY_CPU_BE are not defined, we don't know about ENDIANNESS of platform.
MY_CPU_LE_UNALIGN means that CPU is LITTLE ENDIAN and CPU supports unaligned memory accesses.
MY_CPU_64BIT means that processor can work with 64-bit registers.
MY_CPU_64BIT can be used to select fast code branch
MY_CPU_64BIT doesn't mean that (sizeof(void *) == 8)
*/
#if defined(_M_IX86) || defined(_M_X64) || defined(_M_AMD64) || defined(__i386__) || defined(__x86_64__)
#define LITTLE_ENDIAN_UNALIGN
#if defined(_M_X64) \
|| defined(_M_AMD64) \
|| defined(__x86_64__) \
|| defined(__AMD64__) \
|| defined(__amd64__)
#define MY_CPU_AMD64
#ifdef __ILP32__
#define MY_CPU_NAME "x32"
#define MY_CPU_SIZEOF_POINTER 4
#else
#define MY_CPU_NAME "x64"
#define MY_CPU_SIZEOF_POINTER 8
#endif
#define MY_CPU_64BIT
#endif
#ifdef LITTLE_ENDIAN_UNALIGN
#define GetUi16(p) (*(const UInt16 *)(p))
#define GetUi32(p) (*(const UInt32 *)(p))
#define GetUi64(p) (*(const UInt64 *)(p))
#define SetUi32(p, d) *(UInt32 *)(p) = (d);
#if defined(_M_IX86) \
|| defined(__i386__)
#define MY_CPU_X86
#define MY_CPU_NAME "x86"
/* #define MY_CPU_32BIT */
#define MY_CPU_SIZEOF_POINTER 4
#endif
#if defined(_M_ARM64) \
|| defined(__AARCH64EL__) \
|| defined(__AARCH64EB__) \
|| defined(__aarch64__)
#define MY_CPU_ARM64
#ifdef __ILP32__
#define MY_CPU_NAME "arm64-32"
#define MY_CPU_SIZEOF_POINTER 4
#else
#define MY_CPU_NAME "arm64"
#define MY_CPU_SIZEOF_POINTER 8
#endif
#define MY_CPU_64BIT
#endif
#if defined(_M_ARM) \
|| defined(_M_ARM_NT) \
|| defined(_M_ARMT) \
|| defined(__arm__) \
|| defined(__thumb__) \
|| defined(__ARMEL__) \
|| defined(__ARMEB__) \
|| defined(__THUMBEL__) \
|| defined(__THUMBEB__)
#define MY_CPU_ARM
#if defined(__thumb__) || defined(__THUMBEL__) || defined(_M_ARMT)
#define MY_CPU_ARMT
#define MY_CPU_NAME "armt"
#else
#define MY_CPU_ARM32
#define MY_CPU_NAME "arm"
#endif
/* #define MY_CPU_32BIT */
#define MY_CPU_SIZEOF_POINTER 4
#endif
#if defined(_M_IA64) \
|| defined(__ia64__)
#define MY_CPU_IA64
#define MY_CPU_NAME "ia64"
#define MY_CPU_64BIT
#endif
#if defined(__mips64) \
|| defined(__mips64__) \
|| (defined(__mips) && (__mips == 64 || __mips == 4 || __mips == 3))
#define MY_CPU_NAME "mips64"
#define MY_CPU_64BIT
#elif defined(__mips__)
#define MY_CPU_NAME "mips"
/* #define MY_CPU_32BIT */
#endif
#if defined(__ppc64__) \
|| defined(__powerpc64__) \
|| defined(__ppc__) \
|| defined(__powerpc__) \
|| defined(__PPC__) \
|| defined(_POWER)
#define MY_CPU_PPC_OR_PPC64
#if defined(__ppc64__) \
|| defined(__powerpc64__) \
|| defined(_LP64) \
|| defined(__64BIT__)
#ifdef __ILP32__
#define MY_CPU_NAME "ppc64-32"
#define MY_CPU_SIZEOF_POINTER 4
#else
#define MY_CPU_NAME "ppc64"
#define MY_CPU_SIZEOF_POINTER 8
#endif
#define MY_CPU_64BIT
#else
#define MY_CPU_NAME "ppc"
#define MY_CPU_SIZEOF_POINTER 4
/* #define MY_CPU_32BIT */
#endif
#endif
#if defined(__riscv) \
|| defined(__riscv__)
#if __riscv_xlen == 32
#define MY_CPU_NAME "riscv32"
#elif __riscv_xlen == 64
#define MY_CPU_NAME "riscv64"
#else
#define MY_CPU_NAME "riscv"
#endif
#endif
#if defined(MY_CPU_X86) || defined(MY_CPU_AMD64)
#define MY_CPU_X86_OR_AMD64
#endif
#if defined(MY_CPU_ARM) || defined(MY_CPU_ARM64)
#define MY_CPU_ARM_OR_ARM64
#endif
#ifdef _WIN32
#ifdef MY_CPU_ARM
#define MY_CPU_ARM_LE
#endif
#ifdef MY_CPU_ARM64
#define MY_CPU_ARM64_LE
#endif
#ifdef _M_IA64
#define MY_CPU_IA64_LE
#endif
#endif
#if defined(MY_CPU_X86_OR_AMD64) \
|| defined(MY_CPU_ARM_LE) \
|| defined(MY_CPU_ARM64_LE) \
|| defined(MY_CPU_IA64_LE) \
|| defined(__LITTLE_ENDIAN__) \
|| defined(__ARMEL__) \
|| defined(__THUMBEL__) \
|| defined(__AARCH64EL__) \
|| defined(__MIPSEL__) \
|| defined(__MIPSEL) \
|| defined(_MIPSEL) \
|| defined(__BFIN__) \
|| (defined(__BYTE_ORDER__) && (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__))
#define MY_CPU_LE
#endif
#if defined(__BIG_ENDIAN__) \
|| defined(__ARMEB__) \
|| defined(__THUMBEB__) \
|| defined(__AARCH64EB__) \
|| defined(__MIPSEB__) \
|| defined(__MIPSEB) \
|| defined(_MIPSEB) \
|| defined(__m68k__) \
|| defined(__s390__) \
|| defined(__s390x__) \
|| defined(__zarch__) \
|| (defined(__BYTE_ORDER__) && (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__))
#define MY_CPU_BE
#endif
#if defined(MY_CPU_LE) && defined(MY_CPU_BE)
#error Stop_Compiling_Bad_Endian
#endif
#if !defined(MY_CPU_LE) && !defined(MY_CPU_BE)
#error Stop_Compiling_CPU_ENDIAN_must_be_detected_at_compile_time
#endif
#if defined(MY_CPU_32BIT) && defined(MY_CPU_64BIT)
#error Stop_Compiling_Bad_32_64_BIT
#endif
#ifdef __SIZEOF_POINTER__
#ifdef MY_CPU_SIZEOF_POINTER
#if MY_CPU_SIZEOF_POINTER != __SIZEOF_POINTER__
#error Stop_Compiling_Bad_MY_CPU_PTR_SIZE
#endif
#else
#define MY_CPU_SIZEOF_POINTER __SIZEOF_POINTER__
#endif
#endif
#if defined(MY_CPU_SIZEOF_POINTER) && (MY_CPU_SIZEOF_POINTER == 4)
#if defined (_LP64)
#error Stop_Compiling_Bad_MY_CPU_PTR_SIZE
#endif
#endif
#ifdef _MSC_VER
#if _MSC_VER >= 1300
#define MY_CPU_pragma_pack_push_1 __pragma(pack(push, 1))
#define MY_CPU_pragma_pop __pragma(pack(pop))
#else
#define MY_CPU_pragma_pack_push_1
#define MY_CPU_pragma_pop
#endif
#else
#ifdef __xlC__
#define MY_CPU_pragma_pack_push_1 _Pragma("pack(1)")
#define MY_CPU_pragma_pop _Pragma("pack()")
#else
#define MY_CPU_pragma_pack_push_1 _Pragma("pack(push, 1)")
#define MY_CPU_pragma_pop _Pragma("pack(pop)")
#endif
#endif
#ifndef MY_CPU_NAME
#ifdef MY_CPU_LE
#define MY_CPU_NAME "LE"
#elif defined(MY_CPU_BE)
#define MY_CPU_NAME "BE"
#else
/*
#define MY_CPU_NAME ""
*/
#endif
#endif
#ifdef __has_builtin
#define Z7_has_builtin(x) __has_builtin(x)
#else
#define Z7_has_builtin(x) 0
#endif
#define Z7_BSWAP32_CONST(v) \
( (((UInt32)(v) << 24) ) \
| (((UInt32)(v) << 8) & (UInt32)0xff0000) \
| (((UInt32)(v) >> 8) & (UInt32)0xff00 ) \
| (((UInt32)(v) >> 24) ))
#if defined(_MSC_VER) && (_MSC_VER >= 1300)
#include <stdlib.h>
/* Note: these macros will use bswap instruction (486), that is unsupported in 386 cpu */
#pragma intrinsic(_byteswap_ushort)
#pragma intrinsic(_byteswap_ulong)
#pragma intrinsic(_byteswap_uint64)
#define Z7_BSWAP16(v) _byteswap_ushort(v)
#define Z7_BSWAP32(v) _byteswap_ulong (v)
#define Z7_BSWAP64(v) _byteswap_uint64(v)
#define Z7_CPU_FAST_BSWAP_SUPPORTED
#elif (defined(__GNUC__) && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 3))) \
|| (defined(__clang__) && Z7_has_builtin(__builtin_bswap16))
#define Z7_BSWAP16(v) __builtin_bswap16(v)
#define Z7_BSWAP32(v) __builtin_bswap32(v)
#define Z7_BSWAP64(v) __builtin_bswap64(v)
#define Z7_CPU_FAST_BSWAP_SUPPORTED
#else
#define GetUi16(p) (((const Byte *)(p))[0] | ((UInt16)((const Byte *)(p))[1] << 8))
#define Z7_BSWAP16(v) ((UInt16) \
( ((UInt32)(v) << 8) \
| ((UInt32)(v) >> 8) \
))
#define Z7_BSWAP32(v) Z7_BSWAP32_CONST(v)
#define Z7_BSWAP64(v) \
( ( ( (UInt64)(v) ) << 8 * 7 ) \
| ( ( (UInt64)(v) & ((UInt32)0xff << 8 * 1) ) << 8 * 5 ) \
| ( ( (UInt64)(v) & ((UInt32)0xff << 8 * 2) ) << 8 * 3 ) \
| ( ( (UInt64)(v) & ((UInt32)0xff << 8 * 3) ) << 8 * 1 ) \
| ( ( (UInt64)(v) >> 8 * 1 ) & ((UInt32)0xff << 8 * 3) ) \
| ( ( (UInt64)(v) >> 8 * 3 ) & ((UInt32)0xff << 8 * 2) ) \
| ( ( (UInt64)(v) >> 8 * 5 ) & ((UInt32)0xff << 8 * 1) ) \
| ( ( (UInt64)(v) >> 8 * 7 ) ) \
)
#endif
#ifdef MY_CPU_LE
#if defined(MY_CPU_X86_OR_AMD64) \
|| defined(MY_CPU_ARM64)
#define MY_CPU_LE_UNALIGN
#define MY_CPU_LE_UNALIGN_64
#elif defined(__ARM_FEATURE_UNALIGNED)
/* gcc9 for 32-bit arm can use LDRD instruction that requires 32-bit alignment.
So we can't use unaligned 64-bit operations. */
#define MY_CPU_LE_UNALIGN
#endif
#endif
#ifdef MY_CPU_LE_UNALIGN
#define GetUi16(p) (*(const UInt16 *)(const void *)(p))
#define GetUi32(p) (*(const UInt32 *)(const void *)(p))
#ifdef MY_CPU_LE_UNALIGN_64
#define GetUi64(p) (*(const UInt64 *)(const void *)(p))
#define SetUi64(p, v) { *(UInt64 *)(void *)(p) = (v); }
#endif
#define SetUi16(p, v) { *(UInt16 *)(void *)(p) = (v); }
#define SetUi32(p, v) { *(UInt32 *)(void *)(p) = (v); }
#else
#define GetUi16(p) ( (UInt16) ( \
((const Byte *)(p))[0] | \
((UInt16)((const Byte *)(p))[1] << 8) ))
#define GetUi32(p) ( \
((const Byte *)(p))[0] | \
@@ -35,22 +364,38 @@ about these properties of platform.
((UInt32)((const Byte *)(p))[2] << 16) | \
((UInt32)((const Byte *)(p))[3] << 24))
#define GetUi64(p) (GetUi32(p) | ((UInt64)GetUi32(((const Byte *)(p)) + 4) << 32))
#define SetUi16(p, v) { Byte *_ppp_ = (Byte *)(p); UInt32 _vvv_ = (v); \
_ppp_[0] = (Byte)_vvv_; \
_ppp_[1] = (Byte)(_vvv_ >> 8); }
#define SetUi32(p, d) { UInt32 _x_ = (d); \
((Byte *)(p))[0] = (Byte)_x_; \
((Byte *)(p))[1] = (Byte)(_x_ >> 8); \
((Byte *)(p))[2] = (Byte)(_x_ >> 16); \
((Byte *)(p))[3] = (Byte)(_x_ >> 24); }
#define SetUi32(p, v) { Byte *_ppp_ = (Byte *)(p); UInt32 _vvv_ = (v); \
_ppp_[0] = (Byte)_vvv_; \
_ppp_[1] = (Byte)(_vvv_ >> 8); \
_ppp_[2] = (Byte)(_vvv_ >> 16); \
_ppp_[3] = (Byte)(_vvv_ >> 24); }
#endif
#if defined(LITTLE_ENDIAN_UNALIGN) && defined(_WIN64) && (_MSC_VER >= 1300)
#pragma intrinsic(_byteswap_ulong)
#pragma intrinsic(_byteswap_uint64)
#define GetBe32(p) _byteswap_ulong(*(const UInt32 *)(const Byte *)(p))
#define GetBe64(p) _byteswap_uint64(*(const UInt64 *)(const Byte *)(p))
#ifndef GetUi64
#define GetUi64(p) (GetUi32(p) | ((UInt64)GetUi32(((const Byte *)(p)) + 4) << 32))
#endif
#ifndef SetUi64
#define SetUi64(p, v) { Byte *_ppp2_ = (Byte *)(p); UInt64 _vvv2_ = (v); \
SetUi32(_ppp2_ , (UInt32)_vvv2_) \
SetUi32(_ppp2_ + 4, (UInt32)(_vvv2_ >> 32)) }
#endif
#if defined(MY_CPU_LE_UNALIGN) && defined(Z7_CPU_FAST_BSWAP_SUPPORTED)
#define GetBe32(p) Z7_BSWAP32 (*(const UInt32 *)(const void *)(p))
#define SetBe32(p, v) { (*(UInt32 *)(void *)(p)) = Z7_BSWAP32(v); }
#if defined(MY_CPU_LE_UNALIGN_64)
#define GetBe64(p) Z7_BSWAP64 (*(const UInt64 *)(const void *)(p))
#endif
#else
@@ -60,10 +405,119 @@ about these properties of platform.
((UInt32)((const Byte *)(p))[2] << 8) | \
((const Byte *)(p))[3] )
#define SetBe32(p, v) { Byte *_ppp_ = (Byte *)(p); UInt32 _vvv_ = (v); \
_ppp_[0] = (Byte)(_vvv_ >> 24); \
_ppp_[1] = (Byte)(_vvv_ >> 16); \
_ppp_[2] = (Byte)(_vvv_ >> 8); \
_ppp_[3] = (Byte)_vvv_; }
#endif
#ifndef GetBe64
#define GetBe64(p) (((UInt64)GetBe32(p) << 32) | GetBe32(((const Byte *)(p)) + 4))
#endif
#ifndef GetBe16
#define GetBe16(p) ( (UInt16) ( \
((UInt16)((const Byte *)(p))[0] << 8) | \
((const Byte *)(p))[1] ))
#endif
#if defined(MY_CPU_BE)
#define Z7_CONV_BE_TO_NATIVE_CONST32(v) (v)
#define Z7_CONV_LE_TO_NATIVE_CONST32(v) Z7_BSWAP32_CONST(v)
#define Z7_CONV_NATIVE_TO_BE_32(v) (v)
#elif defined(MY_CPU_LE)
#define Z7_CONV_BE_TO_NATIVE_CONST32(v) Z7_BSWAP32_CONST(v)
#define Z7_CONV_LE_TO_NATIVE_CONST32(v) (v)
#define Z7_CONV_NATIVE_TO_BE_32(v) Z7_BSWAP32(v)
#else
#error Stop_Compiling_Unknown_Endian_CONV
#endif
#if defined(MY_CPU_BE)
#define GetBe32a(p) (*(const UInt32 *)(const void *)(p))
#define GetBe16a(p) (*(const UInt16 *)(const void *)(p))
#define SetBe32a(p, v) { *(UInt32 *)(void *)(p) = (v); }
#define SetBe16a(p, v) { *(UInt16 *)(void *)(p) = (v); }
#define GetUi32a(p) GetUi32(p)
#define GetUi16a(p) GetUi16(p)
#define SetUi32a(p, v) SetUi32(p, v)
#define SetUi16a(p, v) SetUi16(p, v)
#elif defined(MY_CPU_LE)
#define GetUi32a(p) (*(const UInt32 *)(const void *)(p))
#define GetUi16a(p) (*(const UInt16 *)(const void *)(p))
#define SetUi32a(p, v) { *(UInt32 *)(void *)(p) = (v); }
#define SetUi16a(p, v) { *(UInt16 *)(void *)(p) = (v); }
#define GetBe32a(p) GetBe32(p)
#define GetBe16a(p) GetBe16(p)
#define SetBe32a(p, v) SetBe32(p, v)
#define SetBe16a(p, v) SetBe16(p, v)
#else
#error Stop_Compiling_Unknown_Endian_CPU_a
#endif
#if defined(MY_CPU_X86_OR_AMD64) \
|| defined(MY_CPU_ARM_OR_ARM64) \
|| defined(MY_CPU_PPC_OR_PPC64)
#define Z7_CPU_FAST_ROTATE_SUPPORTED
#endif
#ifdef MY_CPU_X86_OR_AMD64
void Z7_FASTCALL z7_x86_cpuid(UInt32 a[4], UInt32 function);
UInt32 Z7_FASTCALL z7_x86_cpuid_GetMaxFunc(void);
#if defined(MY_CPU_AMD64)
#define Z7_IF_X86_CPUID_SUPPORTED
#else
#define Z7_IF_X86_CPUID_SUPPORTED if (z7_x86_cpuid_GetMaxFunc())
#endif
BoolInt CPU_IsSupported_AES(void);
BoolInt CPU_IsSupported_AVX(void);
BoolInt CPU_IsSupported_AVX2(void);
BoolInt CPU_IsSupported_VAES_AVX2(void);
BoolInt CPU_IsSupported_CMOV(void);
BoolInt CPU_IsSupported_SSE(void);
BoolInt CPU_IsSupported_SSE2(void);
BoolInt CPU_IsSupported_SSSE3(void);
BoolInt CPU_IsSupported_SSE41(void);
BoolInt CPU_IsSupported_SHA(void);
BoolInt CPU_IsSupported_PageGB(void);
#elif defined(MY_CPU_ARM_OR_ARM64)
BoolInt CPU_IsSupported_CRC32(void);
BoolInt CPU_IsSupported_NEON(void);
#if defined(_WIN32)
BoolInt CPU_IsSupported_CRYPTO(void);
#define CPU_IsSupported_SHA1 CPU_IsSupported_CRYPTO
#define CPU_IsSupported_SHA2 CPU_IsSupported_CRYPTO
#define CPU_IsSupported_AES CPU_IsSupported_CRYPTO
#else
BoolInt CPU_IsSupported_SHA1(void);
BoolInt CPU_IsSupported_SHA2(void);
BoolInt CPU_IsSupported_AES(void);
#endif
#endif
#define GetBe16(p) (((UInt16)((const Byte *)(p))[0] << 8) | ((const Byte *)(p))[1])
#if defined(__APPLE__)
int z7_sysctlbyname_Get(const char *name, void *buf, size_t *bufSize);
int z7_sysctlbyname_Get_UInt32(const char *name, UInt32 *val);
#endif
EXTERN_C_END
#endif

169
C/Delta.c Executable file
View File

@@ -0,0 +1,169 @@
/* Delta.c -- Delta converter
2021-02-09 : Igor Pavlov : Public domain */
#include "Precomp.h"
#include "Delta.h"
void Delta_Init(Byte *state)
{
unsigned i;
for (i = 0; i < DELTA_STATE_SIZE; i++)
state[i] = 0;
}
void Delta_Encode(Byte *state, unsigned delta, Byte *data, SizeT size)
{
Byte temp[DELTA_STATE_SIZE];
if (size == 0)
return;
{
unsigned i = 0;
do
temp[i] = state[i];
while (++i != delta);
}
if (size <= delta)
{
unsigned i = 0, k;
do
{
Byte b = *data;
*data++ = (Byte)(b - temp[i]);
temp[i] = b;
}
while (++i != size);
k = 0;
do
{
if (i == delta)
i = 0;
state[k] = temp[i++];
}
while (++k != delta);
return;
}
{
Byte *p = data + size - delta;
{
unsigned i = 0;
do
state[i] = *p++;
while (++i != delta);
}
{
const Byte *lim = data + delta;
ptrdiff_t dif = -(ptrdiff_t)delta;
if (((ptrdiff_t)size + dif) & 1)
{
--p; *p = (Byte)(*p - p[dif]);
}
while (p != lim)
{
--p; *p = (Byte)(*p - p[dif]);
--p; *p = (Byte)(*p - p[dif]);
}
dif = -dif;
do
{
--p; *p = (Byte)(*p - temp[--dif]);
}
while (dif != 0);
}
}
}
void Delta_Decode(Byte *state, unsigned delta, Byte *data, SizeT size)
{
unsigned i;
const Byte *lim;
if (size == 0)
return;
i = 0;
lim = data + size;
if (size <= delta)
{
do
*data = (Byte)(*data + state[i++]);
while (++data != lim);
for (; delta != i; state++, delta--)
*state = state[i];
data -= i;
}
else
{
/*
#define B(n) b ## n
#define I(n) Byte B(n) = state[n];
#define U(n) { B(n) = (Byte)((B(n)) + *data++); data[-1] = (B(n)); }
#define F(n) if (data != lim) { U(n) }
if (delta == 1)
{
I(0)
if ((lim - data) & 1) { U(0) }
while (data != lim) { U(0) U(0) }
data -= 1;
}
else if (delta == 2)
{
I(0) I(1)
lim -= 1; while (data < lim) { U(0) U(1) }
lim += 1; F(0)
data -= 2;
}
else if (delta == 3)
{
I(0) I(1) I(2)
lim -= 2; while (data < lim) { U(0) U(1) U(2) }
lim += 2; F(0) F(1)
data -= 3;
}
else if (delta == 4)
{
I(0) I(1) I(2) I(3)
lim -= 3; while (data < lim) { U(0) U(1) U(2) U(3) }
lim += 3; F(0) F(1) F(2)
data -= 4;
}
else
*/
{
do
{
*data = (Byte)(*data + state[i++]);
data++;
}
while (i != delta);
{
ptrdiff_t dif = -(ptrdiff_t)delta;
do
*data = (Byte)(*data + data[dif]);
while (++data != lim);
data += dif;
}
}
}
do
*state++ = *data;
while (++data != lim);
}

19
C/Delta.h Executable file
View File

@@ -0,0 +1,19 @@
/* Delta.h -- Delta converter
2023-03-03 : Igor Pavlov : Public domain */
#ifndef ZIP7_INC_DELTA_H
#define ZIP7_INC_DELTA_H
#include "7zTypes.h"
EXTERN_C_BEGIN
#define DELTA_STATE_SIZE 256
void Delta_Init(Byte *state);
void Delta_Encode(Byte *state, unsigned delta, Byte *data, SizeT size);
void Delta_Decode(Byte *state, unsigned delta, Byte *data, SizeT size);
EXTERN_C_END
#endif

111
C/DllSecur.c Executable file
View File

@@ -0,0 +1,111 @@
/* DllSecur.c -- DLL loading security
2023-04-02 : Igor Pavlov : Public domain */
#include "Precomp.h"
#ifdef _WIN32
#include "7zWindows.h"
#include "DllSecur.h"
#ifndef UNDER_CE
#if (defined(__GNUC__) && (__GNUC__ >= 8)) || defined(__clang__)
// #pragma GCC diagnostic ignored "-Wcast-function-type"
#endif
#if defined(__clang__) || defined(__GNUC__)
typedef void (*Z7_voidFunction)(void);
#define MY_CAST_FUNC (Z7_voidFunction)
#elif defined(_MSC_VER) && _MSC_VER > 1920
#define MY_CAST_FUNC (void *)
// #pragma warning(disable : 4191) // 'type cast': unsafe conversion from 'FARPROC' to 'void (__cdecl *)()'
#else
#define MY_CAST_FUNC
#endif
typedef BOOL (WINAPI *Func_SetDefaultDllDirectories)(DWORD DirectoryFlags);
#define MY_LOAD_LIBRARY_SEARCH_USER_DIRS 0x400
#define MY_LOAD_LIBRARY_SEARCH_SYSTEM32 0x800
#define DELIM "\0"
static const char * const g_Dlls =
"userenv"
DELIM "setupapi"
DELIM "apphelp"
DELIM "propsys"
DELIM "dwmapi"
DELIM "cryptbase"
DELIM "oleacc"
DELIM "clbcatq"
DELIM "version"
#ifndef _CONSOLE
DELIM "uxtheme"
#endif
DELIM;
#endif
#ifdef __clang__
#pragma GCC diagnostic ignored "-Wdeprecated-declarations"
#endif
#if defined (_MSC_VER) && _MSC_VER >= 1900
// sysinfoapi.h: kit10: GetVersion was declared deprecated
#pragma warning(disable : 4996)
#endif
#define IF_NON_VISTA_SET_DLL_DIRS_AND_RETURN \
if ((UInt16)GetVersion() != 6) { \
const \
Func_SetDefaultDllDirectories setDllDirs = \
(Func_SetDefaultDllDirectories) MY_CAST_FUNC GetProcAddress(GetModuleHandle(TEXT("kernel32.dll")), \
"SetDefaultDllDirectories"); \
if (setDllDirs) if (setDllDirs(MY_LOAD_LIBRARY_SEARCH_SYSTEM32 | MY_LOAD_LIBRARY_SEARCH_USER_DIRS)) return; }
void My_SetDefaultDllDirectories(void)
{
#ifndef UNDER_CE
IF_NON_VISTA_SET_DLL_DIRS_AND_RETURN
#endif
}
void LoadSecurityDlls(void)
{
#ifndef UNDER_CE
// at Vista (ver 6.0) : CoCreateInstance(CLSID_ShellLink, ...) doesn't work after SetDefaultDllDirectories() : Check it ???
IF_NON_VISTA_SET_DLL_DIRS_AND_RETURN
{
wchar_t buf[MAX_PATH + 100];
const char *dll;
unsigned pos = GetSystemDirectoryW(buf, MAX_PATH + 2);
if (pos == 0 || pos > MAX_PATH)
return;
if (buf[pos - 1] != '\\')
buf[pos++] = '\\';
for (dll = g_Dlls; *dll != 0;)
{
wchar_t *dest = &buf[pos];
for (;;)
{
const char c = *dll++;
if (c == 0)
break;
*dest++ = (Byte)c;
}
dest[0] = '.';
dest[1] = 'd';
dest[2] = 'l';
dest[3] = 'l';
dest[4] = 0;
// lstrcatW(buf, L".dll");
LoadLibraryExW(buf, NULL, LOAD_WITH_ALTERED_SEARCH_PATH);
}
}
#endif
}
#endif // _WIN32

20
C/DllSecur.h Executable file
View File

@@ -0,0 +1,20 @@
/* DllSecur.h -- DLL loading for security
2023-03-03 : Igor Pavlov : Public domain */
#ifndef ZIP7_INC_DLL_SECUR_H
#define ZIP7_INC_DLL_SECUR_H
#include "7zTypes.h"
EXTERN_C_BEGIN
#ifdef _WIN32
void My_SetDefaultDllDirectories(void);
void LoadSecurityDlls(void);
#endif
EXTERN_C_END
#endif

View File

@@ -1,14 +1,14 @@
/* HuffEnc.c -- functions for Huffman encoding
2008-08-05
Igor Pavlov
Public domain */
2023-03-04 : Igor Pavlov : Public domain */
#include "Precomp.h"
#include "HuffEnc.h"
#include "Sort.h"
#define kMaxLen 16
#define NUM_BITS 10
#define MASK ((1 << NUM_BITS) - 1)
#define MASK (((unsigned)1 << NUM_BITS) - 1)
#define NUM_COUNTERS 64
@@ -67,11 +67,11 @@ void Huffman_Generate(const UInt32 *freqs, UInt32 *p, Byte *lens, UInt32 numSymb
if (num < 2)
{
int minCode = 0;
int maxCode = 1;
unsigned minCode = 0;
unsigned maxCode = 1;
if (num == 1)
{
maxCode = (int)(p[0] & MASK);
maxCode = (unsigned)p[0] & MASK;
if (maxCode == 0)
maxCode++;
}
@@ -106,14 +106,14 @@ void Huffman_Generate(const UInt32 *freqs, UInt32 *p, Byte *lens, UInt32 numSymb
p[--e] &= MASK;
lenCounters[1] = 2;
while (e > 0)
while (e != 0)
{
UInt32 len = (p[p[--e] >> NUM_BITS] >> NUM_BITS) + 1;
p[e] = (p[e] & MASK) | (len << NUM_BITS);
if (len >= maxLen)
for (len = maxLen - 1; lenCounters[len] == 0; len--);
lenCounters[len]--;
lenCounters[len + 1] += 2;
lenCounters[(size_t)len + 1] += 2;
}
{
@@ -121,8 +121,8 @@ void Huffman_Generate(const UInt32 *freqs, UInt32 *p, Byte *lens, UInt32 numSymb
i = 0;
for (len = maxLen; len != 0; len--)
{
UInt32 num;
for (num = lenCounters[len]; num != 0; num--)
UInt32 k;
for (k = lenCounters[len]; k != 0; k--)
lens[p[i++] & MASK] = (Byte)len;
}
}
@@ -133,16 +133,22 @@ void Huffman_Generate(const UInt32 *freqs, UInt32 *p, Byte *lens, UInt32 numSymb
UInt32 code = 0;
UInt32 len;
for (len = 1; len <= kMaxLen; len++)
nextCodes[len] = code = (code + lenCounters[len - 1]) << 1;
nextCodes[len] = code = (code + lenCounters[(size_t)len - 1]) << 1;
}
/* if (code + lenCounters[kMaxLen] - 1 != (1 << kMaxLen) - 1) throw 1; */
{
UInt32 i;
for (i = 0; i < numSymbols; i++)
p[i] = nextCodes[lens[i]]++;
UInt32 k;
for (k = 0; k < numSymbols; k++)
p[k] = nextCodes[lens[k]]++;
}
}
}
}
}
#undef kMaxLen
#undef NUM_BITS
#undef MASK
#undef NUM_COUNTERS
#undef HUFFMAN_SPEED_OPT

View File

@@ -1,12 +1,12 @@
/* HuffEnc.h -- functions for Huffman encoding
2008-03-26
Igor Pavlov
Public domain */
/* HuffEnc.h -- Huffman encoding
2023-03-05 : Igor Pavlov : Public domain */
#ifndef __HUFFENC_H
#define __HUFFENC_H
#ifndef ZIP7_INC_HUFF_ENC_H
#define ZIP7_INC_HUFF_ENC_H
#include "Types.h"
#include "7zTypes.h"
EXTERN_C_BEGIN
/*
Conditions:
@@ -18,4 +18,6 @@ Conditions:
void Huffman_Generate(const UInt32 *freqs, UInt32 *p, Byte *lens, UInt32 num, UInt32 maxLen);
EXTERN_C_END
#endif

1764
C/LzFind.c
View File

File diff suppressed because it is too large Load Diff

View File

@@ -1,76 +1,121 @@
/* LzFind.h -- Match finder for LZ algorithms
2008-10-04 : Igor Pavlov : Public domain */
2023-03-04 : Igor Pavlov : Public domain */
#ifndef __LZFIND_H
#define __LZFIND_H
#ifndef ZIP7_INC_LZ_FIND_H
#define ZIP7_INC_LZ_FIND_H
#include "Types.h"
#include "7zTypes.h"
EXTERN_C_BEGIN
typedef UInt32 CLzRef;
typedef struct _CMatchFinder
typedef struct
{
Byte *buffer;
const Byte *buffer;
UInt32 pos;
UInt32 posLimit;
UInt32 streamPos;
UInt32 streamPos; /* wrap over Zero is allowed (streamPos < pos). Use (UInt32)(streamPos - pos) */
UInt32 lenLimit;
UInt32 cyclicBufferPos;
UInt32 cyclicBufferSize; /* it must be = (historySize + 1) */
Byte streamEndWasReached;
Byte btMode;
Byte bigHash;
Byte directInput;
UInt32 matchMaxLen;
CLzRef *hash;
CLzRef *son;
UInt32 hashMask;
UInt32 cutValue;
Byte *bufferBase;
ISeqInStream *stream;
int streamEndWasReached;
Byte *bufBase;
ISeqInStreamPtr stream;
UInt32 blockSize;
UInt32 keepSizeBefore;
UInt32 keepSizeAfter;
UInt32 numHashBytes;
int directInput;
int btMode;
/* int skipModeBits; */
int bigHash;
size_t directInputRem;
UInt32 historySize;
UInt32 fixedHashSize;
UInt32 hashSizeSum;
UInt32 numSons;
Byte numHashBytes_Min;
Byte numHashOutBits;
Byte _pad2_[2];
SRes result;
UInt32 crc[256];
size_t numRefs;
UInt64 expectedDataSize;
} CMatchFinder;
#define Inline_MatchFinder_GetPointerToCurrentPos(p) ((p)->buffer)
#define Inline_MatchFinder_GetIndexByte(p, index) ((p)->buffer[(Int32)(index)])
#define Inline_MatchFinder_GetPointerToCurrentPos(p) ((const Byte *)(p)->buffer)
#define Inline_MatchFinder_GetNumAvailableBytes(p) ((p)->streamPos - (p)->pos)
#define Inline_MatchFinder_GetNumAvailableBytes(p) ((UInt32)((p)->streamPos - (p)->pos))
/*
#define Inline_MatchFinder_IsFinishedOK(p) \
((p)->streamEndWasReached \
&& (p)->streamPos == (p)->pos \
&& (!(p)->directInput || (p)->directInputRem == 0))
*/
int MatchFinder_NeedMove(CMatchFinder *p);
Byte *MatchFinder_GetPointerToCurrentPos(CMatchFinder *p);
/* Byte *MatchFinder_GetPointerToCurrentPos(CMatchFinder *p); */
void MatchFinder_MoveBlock(CMatchFinder *p);
void MatchFinder_ReadIfRequired(CMatchFinder *p);
void MatchFinder_Construct(CMatchFinder *p);
/* Conditions:
historySize <= 3 GB
keepAddBufferBefore + matchMaxLen + keepAddBufferAfter < 511MB
/* (directInput = 0) is default value.
It's required to provide correct (directInput) value
before calling MatchFinder_Create().
You can set (directInput) by any of the following calls:
- MatchFinder_SET_DIRECT_INPUT_BUF()
- MatchFinder_SET_STREAM()
- MatchFinder_SET_STREAM_MODE()
*/
#define MatchFinder_SET_DIRECT_INPUT_BUF(p, _src_, _srcLen_) { \
(p)->stream = NULL; \
(p)->directInput = 1; \
(p)->buffer = (_src_); \
(p)->directInputRem = (_srcLen_); }
/*
#define MatchFinder_SET_STREAM_MODE(p) { \
(p)->directInput = 0; }
*/
#define MatchFinder_SET_STREAM(p, _stream_) { \
(p)->stream = _stream_; \
(p)->directInput = 0; }
int MatchFinder_Create(CMatchFinder *p, UInt32 historySize,
UInt32 keepAddBufferBefore, UInt32 matchMaxLen, UInt32 keepAddBufferAfter,
ISzAlloc *alloc);
void MatchFinder_Free(CMatchFinder *p, ISzAlloc *alloc);
void MatchFinder_Normalize3(UInt32 subValue, CLzRef *items, UInt32 numItems);
void MatchFinder_ReduceOffsets(CMatchFinder *p, UInt32 subValue);
ISzAllocPtr alloc);
void MatchFinder_Free(CMatchFinder *p, ISzAllocPtr alloc);
void MatchFinder_Normalize3(UInt32 subValue, CLzRef *items, size_t numItems);
/*
#define MatchFinder_INIT_POS(p, val) \
(p)->pos = (val); \
(p)->streamPos = (val);
*/
// void MatchFinder_ReduceOffsets(CMatchFinder *p, UInt32 subValue);
#define MatchFinder_REDUCE_OFFSETS(p, subValue) \
(p)->pos -= (subValue); \
(p)->streamPos -= (subValue);
UInt32 * GetMatchesSpec1(UInt32 lenLimit, UInt32 curMatch, UInt32 pos, const Byte *buffer, CLzRef *son,
UInt32 _cyclicBufferPos, UInt32 _cyclicBufferSize, UInt32 _cutValue,
size_t _cyclicBufferPos, UInt32 _cyclicBufferSize, UInt32 _cutValue,
UInt32 *distances, UInt32 maxLen);
/*
@@ -80,28 +125,35 @@ Conditions:
*/
typedef void (*Mf_Init_Func)(void *object);
typedef Byte (*Mf_GetIndexByte_Func)(void *object, Int32 index);
typedef UInt32 (*Mf_GetNumAvailableBytes_Func)(void *object);
typedef const Byte * (*Mf_GetPointerToCurrentPos_Func)(void *object);
typedef UInt32 (*Mf_GetMatches_Func)(void *object, UInt32 *distances);
typedef UInt32 * (*Mf_GetMatches_Func)(void *object, UInt32 *distances);
typedef void (*Mf_Skip_Func)(void *object, UInt32);
typedef struct _IMatchFinder
typedef struct
{
Mf_Init_Func Init;
Mf_GetIndexByte_Func GetIndexByte;
Mf_GetNumAvailableBytes_Func GetNumAvailableBytes;
Mf_GetPointerToCurrentPos_Func GetPointerToCurrentPos;
Mf_GetMatches_Func GetMatches;
Mf_Skip_Func Skip;
} IMatchFinder;
} IMatchFinder2;
void MatchFinder_CreateVTable(CMatchFinder *p, IMatchFinder *vTable);
void MatchFinder_CreateVTable(CMatchFinder *p, IMatchFinder2 *vTable);
void MatchFinder_Init_LowHash(CMatchFinder *p);
void MatchFinder_Init_HighHash(CMatchFinder *p);
void MatchFinder_Init_4(CMatchFinder *p);
void MatchFinder_Init(CMatchFinder *p);
UInt32 Bt3Zip_MatchFinder_GetMatches(CMatchFinder *p, UInt32 *distances);
UInt32 Hc3Zip_MatchFinder_GetMatches(CMatchFinder *p, UInt32 *distances);
UInt32* Bt3Zip_MatchFinder_GetMatches(CMatchFinder *p, UInt32 *distances);
UInt32* Hc3Zip_MatchFinder_GetMatches(CMatchFinder *p, UInt32 *distances);
void Bt3Zip_MatchFinder_Skip(CMatchFinder *p, UInt32 num);
void Hc3Zip_MatchFinder_Skip(CMatchFinder *p, UInt32 num);
void LzFindPrepare(void);
EXTERN_C_END
#endif

View File

File diff suppressed because it is too large Load Diff

View File

@@ -1,37 +1,34 @@
/* LzFindMt.h -- multithreaded Match finder for LZ algorithms
2008-10-04 : Igor Pavlov : Public domain */
2023-03-05 : Igor Pavlov : Public domain */
#ifndef __LZFINDMT_H
#define __LZFINDMT_H
#ifndef ZIP7_INC_LZ_FIND_MT_H
#define ZIP7_INC_LZ_FIND_MT_H
#include "Threads.h"
#include "LzFind.h"
#include "Threads.h"
#define kMtHashBlockSize (1 << 13)
#define kMtHashNumBlocks (1 << 3)
#define kMtHashNumBlocksMask (kMtHashNumBlocks - 1)
EXTERN_C_BEGIN
#define kMtBtBlockSize (1 << 14)
#define kMtBtNumBlocks (1 << 6)
#define kMtBtNumBlocksMask (kMtBtNumBlocks - 1)
typedef struct _CMtSync
typedef struct
{
Bool wasCreated;
Bool needStart;
Bool exit;
Bool stopWriting;
UInt32 numProcessedBlocks;
CThread thread;
UInt64 affinity;
BoolInt wasCreated;
BoolInt needStart;
BoolInt csWasInitialized;
BoolInt csWasEntered;
BoolInt exit;
BoolInt stopWriting;
CAutoResetEvent canStart;
CAutoResetEvent wasStarted;
CAutoResetEvent wasStopped;
CSemaphore freeSemaphore;
CSemaphore filledSemaphore;
Bool csWasInitialized;
Bool csWasEntered;
CCriticalSection cs;
UInt32 numProcessedBlocks;
// UInt32 numBlocks_Sent;
} CMtSync;
typedef UInt32 * (*Mf_Mix_Matches)(void *p, UInt32 matchMinPos, UInt32 *distances);
@@ -42,23 +39,28 @@ typedef UInt32 * (*Mf_Mix_Matches)(void *p, UInt32 matchMinPos, UInt32 *distance
typedef void (*Mf_GetHeads)(const Byte *buffer, UInt32 pos,
UInt32 *hash, UInt32 hashMask, UInt32 *heads, UInt32 numHeads, const UInt32 *crc);
typedef struct _CMatchFinderMt
typedef struct
{
/* LZ */
const Byte *pointerToCurPos;
UInt32 *btBuf;
UInt32 btBufPos;
UInt32 btBufPosLimit;
const UInt32 *btBufPos;
const UInt32 *btBufPosLimit;
UInt32 lzPos;
UInt32 btNumAvailBytes;
UInt32 *hash;
UInt32 fixedHashSize;
// UInt32 hash4Mask;
UInt32 historySize;
const UInt32 *crc;
Mf_Mix_Matches MixMatchesFunc;
UInt32 failure_LZ_BT; // failure in BT transfered to LZ
// UInt32 failure_LZ_LZ; // failure in LZ tables
UInt32 failureBuf[1];
// UInt32 crc[256];
/* LZ + BT */
CMtSync btSync;
Byte btDummy[kMtCacheLineDummy];
@@ -68,14 +70,16 @@ typedef struct _CMatchFinderMt
UInt32 hashBufPos;
UInt32 hashBufPosLimit;
UInt32 hashNumAvail;
UInt32 failure_BT;
CLzRef *son;
UInt32 matchMaxLen;
UInt32 numHashBytes;
UInt32 pos;
Byte *buffer;
const Byte *buffer;
UInt32 cyclicBufferPos;
UInt32 cyclicBufferSize; /* it must be historySize + 1 */
UInt32 cyclicBufferSize; /* it must be = (historySize + 1) */
UInt32 cutValue;
/* BT + Hash */
@@ -85,13 +89,21 @@ typedef struct _CMatchFinderMt
/* Hash */
Mf_GetHeads GetHeadsFunc;
CMatchFinder *MatchFinder;
// CMatchFinder MatchFinder;
} CMatchFinderMt;
// only for Mt part
void MatchFinderMt_Construct(CMatchFinderMt *p);
void MatchFinderMt_Destruct(CMatchFinderMt *p, ISzAlloc *alloc);
void MatchFinderMt_Destruct(CMatchFinderMt *p, ISzAllocPtr alloc);
SRes MatchFinderMt_Create(CMatchFinderMt *p, UInt32 historySize, UInt32 keepAddBufferBefore,
UInt32 matchMaxLen, UInt32 keepAddBufferAfter, ISzAlloc *alloc);
void MatchFinderMt_CreateVTable(CMatchFinderMt *p, IMatchFinder *vTable);
UInt32 matchMaxLen, UInt32 keepAddBufferAfter, ISzAllocPtr alloc);
void MatchFinderMt_CreateVTable(CMatchFinderMt *p, IMatchFinder2 *vTable);
/* call MatchFinderMt_InitMt() before IMatchFinder::Init() */
SRes MatchFinderMt_InitMt(CMatchFinderMt *p);
void MatchFinderMt_ReleaseStream(CMatchFinderMt *p);
EXTERN_C_END
#endif

578
C/LzFindOpt.c Executable file
View File

@@ -0,0 +1,578 @@
/* LzFindOpt.c -- multithreaded Match finder for LZ algorithms
2023-04-02 : Igor Pavlov : Public domain */
#include "Precomp.h"
#include "CpuArch.h"
#include "LzFind.h"
// #include "LzFindMt.h"
// #define LOG_ITERS
// #define LOG_THREAD
#ifdef LOG_THREAD
#include <stdio.h>
#define PRF(x) x
#else
// #define PRF(x)
#endif
#ifdef LOG_ITERS
#include <stdio.h>
UInt64 g_NumIters_Tree;
UInt64 g_NumIters_Loop;
UInt64 g_NumIters_Bytes;
#define LOG_ITER(x) x
#else
#define LOG_ITER(x)
#endif
// ---------- BT THREAD ----------
#define USE_SON_PREFETCH
#define USE_LONG_MATCH_OPT
#define kEmptyHashValue 0
// #define CYC_TO_POS_OFFSET 0
// #define CYC_TO_POS_OFFSET 1 // for debug
/*
Z7_NO_INLINE
UInt32 * Z7_FASTCALL GetMatchesSpecN_1(const Byte *lenLimit, size_t pos, const Byte *cur, CLzRef *son,
UInt32 _cutValue, UInt32 *d, size_t _maxLen, const UInt32 *hash, const UInt32 *limit, const UInt32 *size, UInt32 *posRes)
{
do
{
UInt32 delta;
if (hash == size)
break;
delta = *hash++;
if (delta == 0 || delta > (UInt32)pos)
return NULL;
lenLimit++;
if (delta == (UInt32)pos)
{
CLzRef *ptr1 = son + ((size_t)pos << 1) - CYC_TO_POS_OFFSET * 2;
*d++ = 0;
ptr1[0] = kEmptyHashValue;
ptr1[1] = kEmptyHashValue;
}
else
{
UInt32 *_distances = ++d;
CLzRef *ptr0 = son + ((size_t)(pos) << 1) - CYC_TO_POS_OFFSET * 2 + 1;
CLzRef *ptr1 = son + ((size_t)(pos) << 1) - CYC_TO_POS_OFFSET * 2;
const Byte *len0 = cur, *len1 = cur;
UInt32 cutValue = _cutValue;
const Byte *maxLen = cur + _maxLen;
for (LOG_ITER(g_NumIters_Tree++);;)
{
LOG_ITER(g_NumIters_Loop++);
{
const ptrdiff_t diff = (ptrdiff_t)0 - (ptrdiff_t)delta;
CLzRef *pair = son + ((size_t)(((ptrdiff_t)pos - CYC_TO_POS_OFFSET) + diff) << 1);
const Byte *len = (len0 < len1 ? len0 : len1);
#ifdef USE_SON_PREFETCH
const UInt32 pair0 = *pair;
#endif
if (len[diff] == len[0])
{
if (++len != lenLimit && len[diff] == len[0])
while (++len != lenLimit)
{
LOG_ITER(g_NumIters_Bytes++);
if (len[diff] != len[0])
break;
}
if (maxLen < len)
{
maxLen = len;
*d++ = (UInt32)(len - cur);
*d++ = delta - 1;
if (len == lenLimit)
{
const UInt32 pair1 = pair[1];
*ptr1 =
#ifdef USE_SON_PREFETCH
pair0;
#else
pair[0];
#endif
*ptr0 = pair1;
_distances[-1] = (UInt32)(d - _distances);
#ifdef USE_LONG_MATCH_OPT
if (hash == size || *hash != delta || lenLimit[diff] != lenLimit[0] || d >= limit)
break;
{
for (;;)
{
hash++;
pos++;
cur++;
lenLimit++;
{
CLzRef *ptr = son + ((size_t)(pos) << 1) - CYC_TO_POS_OFFSET * 2;
#if 0
*(UInt64 *)(void *)ptr = ((const UInt64 *)(const void *)ptr)[diff];
#else
const UInt32 p0 = ptr[0 + (diff * 2)];
const UInt32 p1 = ptr[1 + (diff * 2)];
ptr[0] = p0;
ptr[1] = p1;
// ptr[0] = ptr[0 + (diff * 2)];
// ptr[1] = ptr[1 + (diff * 2)];
#endif
}
// PrintSon(son + 2, pos - 1);
// printf("\npos = %x delta = %x\n", pos, delta);
len++;
*d++ = 2;
*d++ = (UInt32)(len - cur);
*d++ = delta - 1;
if (hash == size || *hash != delta || lenLimit[diff] != lenLimit[0] || d >= limit)
break;
}
}
#endif
break;
}
}
}
{
const UInt32 curMatch = (UInt32)pos - delta; // (UInt32)(pos + diff);
if (len[diff] < len[0])
{
delta = pair[1];
if (delta >= curMatch)
return NULL;
*ptr1 = curMatch;
ptr1 = pair + 1;
len1 = len;
}
else
{
delta = *pair;
if (delta >= curMatch)
return NULL;
*ptr0 = curMatch;
ptr0 = pair;
len0 = len;
}
delta = (UInt32)pos - delta;
if (--cutValue == 0 || delta >= pos)
{
*ptr0 = *ptr1 = kEmptyHashValue;
_distances[-1] = (UInt32)(d - _distances);
break;
}
}
}
} // for (tree iterations)
}
pos++;
cur++;
}
while (d < limit);
*posRes = (UInt32)pos;
return d;
}
*/
/* define cbs if you use 2 functions.
GetMatchesSpecN_1() : (pos < _cyclicBufferSize)
GetMatchesSpecN_2() : (pos >= _cyclicBufferSize)
do not define cbs if you use 1 function:
GetMatchesSpecN_2()
*/
// #define cbs _cyclicBufferSize
/*
we use size_t for (pos) and (_cyclicBufferPos_ instead of UInt32
to eliminate "movsx" BUG in old MSVC x64 compiler.
*/
UInt32 * Z7_FASTCALL GetMatchesSpecN_2(const Byte *lenLimit, size_t pos, const Byte *cur, CLzRef *son,
UInt32 _cutValue, UInt32 *d, size_t _maxLen, const UInt32 *hash, const UInt32 *limit, const UInt32 *size,
size_t _cyclicBufferPos, UInt32 _cyclicBufferSize,
UInt32 *posRes);
Z7_NO_INLINE
UInt32 * Z7_FASTCALL GetMatchesSpecN_2(const Byte *lenLimit, size_t pos, const Byte *cur, CLzRef *son,
UInt32 _cutValue, UInt32 *d, size_t _maxLen, const UInt32 *hash, const UInt32 *limit, const UInt32 *size,
size_t _cyclicBufferPos, UInt32 _cyclicBufferSize,
UInt32 *posRes)
{
do // while (hash != size)
{
UInt32 delta;
#ifndef cbs
UInt32 cbs;
#endif
if (hash == size)
break;
delta = *hash++;
if (delta == 0)
return NULL;
lenLimit++;
#ifndef cbs
cbs = _cyclicBufferSize;
if ((UInt32)pos < cbs)
{
if (delta > (UInt32)pos)
return NULL;
cbs = (UInt32)pos;
}
#endif
if (delta >= cbs)
{
CLzRef *ptr1 = son + ((size_t)_cyclicBufferPos << 1);
*d++ = 0;
ptr1[0] = kEmptyHashValue;
ptr1[1] = kEmptyHashValue;
}
else
{
UInt32 *_distances = ++d;
CLzRef *ptr0 = son + ((size_t)_cyclicBufferPos << 1) + 1;
CLzRef *ptr1 = son + ((size_t)_cyclicBufferPos << 1);
UInt32 cutValue = _cutValue;
const Byte *len0 = cur, *len1 = cur;
const Byte *maxLen = cur + _maxLen;
// if (cutValue == 0) { *ptr0 = *ptr1 = kEmptyHashValue; } else
for (LOG_ITER(g_NumIters_Tree++);;)
{
LOG_ITER(g_NumIters_Loop++);
{
// SPEC code
CLzRef *pair = son + ((size_t)((ptrdiff_t)_cyclicBufferPos - (ptrdiff_t)delta
+ (ptrdiff_t)(UInt32)(_cyclicBufferPos < delta ? cbs : 0)
) << 1);
const ptrdiff_t diff = (ptrdiff_t)0 - (ptrdiff_t)delta;
const Byte *len = (len0 < len1 ? len0 : len1);
#ifdef USE_SON_PREFETCH
const UInt32 pair0 = *pair;
#endif
if (len[diff] == len[0])
{
if (++len != lenLimit && len[diff] == len[0])
while (++len != lenLimit)
{
LOG_ITER(g_NumIters_Bytes++);
if (len[diff] != len[0])
break;
}
if (maxLen < len)
{
maxLen = len;
*d++ = (UInt32)(len - cur);
*d++ = delta - 1;
if (len == lenLimit)
{
const UInt32 pair1 = pair[1];
*ptr1 =
#ifdef USE_SON_PREFETCH
pair0;
#else
pair[0];
#endif
*ptr0 = pair1;
_distances[-1] = (UInt32)(d - _distances);
#ifdef USE_LONG_MATCH_OPT
if (hash == size || *hash != delta || lenLimit[diff] != lenLimit[0] || d >= limit)
break;
{
for (;;)
{
*d++ = 2;
*d++ = (UInt32)(lenLimit - cur);
*d++ = delta - 1;
cur++;
lenLimit++;
// SPEC
_cyclicBufferPos++;
{
// SPEC code
CLzRef *dest = son + ((size_t)(_cyclicBufferPos) << 1);
const CLzRef *src = dest + ((diff
+ (ptrdiff_t)(UInt32)((_cyclicBufferPos < delta) ? cbs : 0)) << 1);
// CLzRef *ptr = son + ((size_t)(pos) << 1) - CYC_TO_POS_OFFSET * 2;
#if 0
*(UInt64 *)(void *)dest = *((const UInt64 *)(const void *)src);
#else
const UInt32 p0 = src[0];
const UInt32 p1 = src[1];
dest[0] = p0;
dest[1] = p1;
#endif
}
pos++;
hash++;
if (hash == size || *hash != delta || lenLimit[diff] != lenLimit[0] || d >= limit)
break;
} // for() end for long matches
}
#endif
break; // break from TREE iterations
}
}
}
{
const UInt32 curMatch = (UInt32)pos - delta; // (UInt32)(pos + diff);
if (len[diff] < len[0])
{
delta = pair[1];
*ptr1 = curMatch;
ptr1 = pair + 1;
len1 = len;
if (delta >= curMatch)
return NULL;
}
else
{
delta = *pair;
*ptr0 = curMatch;
ptr0 = pair;
len0 = len;
if (delta >= curMatch)
return NULL;
}
delta = (UInt32)pos - delta;
if (--cutValue == 0 || delta >= cbs)
{
*ptr0 = *ptr1 = kEmptyHashValue;
_distances[-1] = (UInt32)(d - _distances);
break;
}
}
}
} // for (tree iterations)
}
pos++;
_cyclicBufferPos++;
cur++;
}
while (d < limit);
*posRes = (UInt32)pos;
return d;
}
/*
typedef UInt32 uint32plus; // size_t
UInt32 * Z7_FASTCALL GetMatchesSpecN_3(uint32plus lenLimit, size_t pos, const Byte *cur, CLzRef *son,
UInt32 _cutValue, UInt32 *d, uint32plus _maxLen, const UInt32 *hash, const UInt32 *limit, const UInt32 *size,
size_t _cyclicBufferPos, UInt32 _cyclicBufferSize,
UInt32 *posRes)
{
do // while (hash != size)
{
UInt32 delta;
#ifndef cbs
UInt32 cbs;
#endif
if (hash == size)
break;
delta = *hash++;
if (delta == 0)
return NULL;
#ifndef cbs
cbs = _cyclicBufferSize;
if ((UInt32)pos < cbs)
{
if (delta > (UInt32)pos)
return NULL;
cbs = (UInt32)pos;
}
#endif
if (delta >= cbs)
{
CLzRef *ptr1 = son + ((size_t)_cyclicBufferPos << 1);
*d++ = 0;
ptr1[0] = kEmptyHashValue;
ptr1[1] = kEmptyHashValue;
}
else
{
CLzRef *ptr0 = son + ((size_t)_cyclicBufferPos << 1) + 1;
CLzRef *ptr1 = son + ((size_t)_cyclicBufferPos << 1);
UInt32 *_distances = ++d;
uint32plus len0 = 0, len1 = 0;
UInt32 cutValue = _cutValue;
uint32plus maxLen = _maxLen;
// lenLimit++; // const Byte *lenLimit = cur + _lenLimit;
for (LOG_ITER(g_NumIters_Tree++);;)
{
LOG_ITER(g_NumIters_Loop++);
{
// const ptrdiff_t diff = (ptrdiff_t)0 - (ptrdiff_t)delta;
CLzRef *pair = son + ((size_t)((ptrdiff_t)_cyclicBufferPos - delta
+ (ptrdiff_t)(UInt32)(_cyclicBufferPos < delta ? cbs : 0)
) << 1);
const Byte *pb = cur - delta;
uint32plus len = (len0 < len1 ? len0 : len1);
#ifdef USE_SON_PREFETCH
const UInt32 pair0 = *pair;
#endif
if (pb[len] == cur[len])
{
if (++len != lenLimit && pb[len] == cur[len])
while (++len != lenLimit)
if (pb[len] != cur[len])
break;
if (maxLen < len)
{
maxLen = len;
*d++ = (UInt32)len;
*d++ = delta - 1;
if (len == lenLimit)
{
{
const UInt32 pair1 = pair[1];
*ptr0 = pair1;
*ptr1 =
#ifdef USE_SON_PREFETCH
pair0;
#else
pair[0];
#endif
}
_distances[-1] = (UInt32)(d - _distances);
#ifdef USE_LONG_MATCH_OPT
if (hash == size || *hash != delta || pb[lenLimit] != cur[lenLimit] || d >= limit)
break;
{
const ptrdiff_t diff = (ptrdiff_t)0 - (ptrdiff_t)delta;
for (;;)
{
*d++ = 2;
*d++ = (UInt32)lenLimit;
*d++ = delta - 1;
_cyclicBufferPos++;
{
CLzRef *dest = son + ((size_t)_cyclicBufferPos << 1);
const CLzRef *src = dest + ((diff +
(ptrdiff_t)(UInt32)(_cyclicBufferPos < delta ? cbs : 0)) << 1);
#if 0
*(UInt64 *)(void *)dest = *((const UInt64 *)(const void *)src);
#else
const UInt32 p0 = src[0];
const UInt32 p1 = src[1];
dest[0] = p0;
dest[1] = p1;
#endif
}
hash++;
pos++;
cur++;
pb++;
if (hash == size || *hash != delta || pb[lenLimit] != cur[lenLimit] || d >= limit)
break;
}
}
#endif
break;
}
}
}
{
const UInt32 curMatch = (UInt32)pos - delta;
if (pb[len] < cur[len])
{
delta = pair[1];
*ptr1 = curMatch;
ptr1 = pair + 1;
len1 = len;
}
else
{
delta = *pair;
*ptr0 = curMatch;
ptr0 = pair;
len0 = len;
}
{
if (delta >= curMatch)
return NULL;
delta = (UInt32)pos - delta;
if (delta >= cbs
// delta >= _cyclicBufferSize || delta >= pos
|| --cutValue == 0)
{
*ptr0 = *ptr1 = kEmptyHashValue;
_distances[-1] = (UInt32)(d - _distances);
break;
}
}
}
}
} // for (tree iterations)
}
pos++;
_cyclicBufferPos++;
cur++;
}
while (d < limit);
*posRes = (UInt32)pos;
return d;
}
*/

View File

@@ -1,54 +1,34 @@
/* LzHash.h -- HASH functions for LZ algorithms
2008-10-04 : Igor Pavlov : Public domain */
/* LzHash.h -- HASH constants for LZ algorithms
2023-03-05 : Igor Pavlov : Public domain */
#ifndef __LZHASH_H
#define __LZHASH_H
#ifndef ZIP7_INC_LZ_HASH_H
#define ZIP7_INC_LZ_HASH_H
/*
(kHash2Size >= (1 << 8)) : Required
(kHash3Size >= (1 << 16)) : Required
*/
#define kHash2Size (1 << 10)
#define kHash3Size (1 << 16)
#define kHash4Size (1 << 20)
// #define kHash4Size (1 << 20)
#define kFix3HashSize (kHash2Size)
#define kFix4HashSize (kHash2Size + kHash3Size)
#define kFix5HashSize (kHash2Size + kHash3Size + kHash4Size)
// #define kFix5HashSize (kHash2Size + kHash3Size + kHash4Size)
#define HASH2_CALC hashValue = cur[0] | ((UInt32)cur[1] << 8);
/*
We use up to 3 crc values for hash:
crc0
crc1 << Shift_1
crc2 << Shift_2
(Shift_1 = 5) and (Shift_2 = 10) is good tradeoff.
Small values for Shift are not good for collision rate.
Big value for Shift_2 increases the minimum size
of hash table, that will be slow for small files.
*/
#define HASH3_CALC { \
UInt32 temp = p->crc[cur[0]] ^ cur[1]; \
hash2Value = temp & (kHash2Size - 1); \
hashValue = (temp ^ ((UInt32)cur[2] << 8)) & p->hashMask; }
#define HASH4_CALC { \
UInt32 temp = p->crc[cur[0]] ^ cur[1]; \
hash2Value = temp & (kHash2Size - 1); \
hash3Value = (temp ^ ((UInt32)cur[2] << 8)) & (kHash3Size - 1); \
hashValue = (temp ^ ((UInt32)cur[2] << 8) ^ (p->crc[cur[3]] << 5)) & p->hashMask; }
#define HASH5_CALC { \
UInt32 temp = p->crc[cur[0]] ^ cur[1]; \
hash2Value = temp & (kHash2Size - 1); \
hash3Value = (temp ^ ((UInt32)cur[2] << 8)) & (kHash3Size - 1); \
hash4Value = (temp ^ ((UInt32)cur[2] << 8) ^ (p->crc[cur[3]] << 5)); \
hashValue = (hash4Value ^ (p->crc[cur[4]] << 3)) & p->hashMask; \
hash4Value &= (kHash4Size - 1); }
/* #define HASH_ZIP_CALC hashValue = ((cur[0] | ((UInt32)cur[1] << 8)) ^ p->crc[cur[2]]) & 0xFFFF; */
#define HASH_ZIP_CALC hashValue = ((cur[2] | ((UInt32)cur[0] << 8)) ^ p->crc[cur[1]]) & 0xFFFF;
#define MT_HASH2_CALC \
hash2Value = (p->crc[cur[0]] ^ cur[1]) & (kHash2Size - 1);
#define MT_HASH3_CALC { \
UInt32 temp = p->crc[cur[0]] ^ cur[1]; \
hash2Value = temp & (kHash2Size - 1); \
hash3Value = (temp ^ ((UInt32)cur[2] << 8)) & (kHash3Size - 1); }
#define MT_HASH4_CALC { \
UInt32 temp = p->crc[cur[0]] ^ cur[1]; \
hash2Value = temp & (kHash2Size - 1); \
hash3Value = (temp ^ ((UInt32)cur[2] << 8)) & (kHash3Size - 1); \
hash4Value = (temp ^ ((UInt32)cur[2] << 8) ^ (p->crc[cur[3]] << 5)) & (kHash4Size - 1); }
#define kLzHash_CrcShift_1 5
#define kLzHash_CrcShift_2 10
#endif

491
C/Lzma2Dec.c Executable file
View File

@@ -0,0 +1,491 @@
/* Lzma2Dec.c -- LZMA2 Decoder
2023-03-03 : Igor Pavlov : Public domain */
/* #define SHOW_DEBUG_INFO */
#include "Precomp.h"
#ifdef SHOW_DEBUG_INFO
#include <stdio.h>
#endif
#include <string.h>
#include "Lzma2Dec.h"
/*
00000000 - End of data
00000001 U U - Uncompressed, reset dic, need reset state and set new prop
00000010 U U - Uncompressed, no reset
100uuuuu U U P P - LZMA, no reset
101uuuuu U U P P - LZMA, reset state
110uuuuu U U P P S - LZMA, reset state + set new prop
111uuuuu U U P P S - LZMA, reset state + set new prop, reset dic
u, U - Unpack Size
P - Pack Size
S - Props
*/
#define LZMA2_CONTROL_COPY_RESET_DIC 1
#define LZMA2_IS_UNCOMPRESSED_STATE(p) (((p)->control & (1 << 7)) == 0)
#define LZMA2_LCLP_MAX 4
#define LZMA2_DIC_SIZE_FROM_PROP(p) (((UInt32)2 | ((p) & 1)) << ((p) / 2 + 11))
#ifdef SHOW_DEBUG_INFO
#define PRF(x) x
#else
#define PRF(x)
#endif
typedef enum
{
LZMA2_STATE_CONTROL,
LZMA2_STATE_UNPACK0,
LZMA2_STATE_UNPACK1,
LZMA2_STATE_PACK0,
LZMA2_STATE_PACK1,
LZMA2_STATE_PROP,
LZMA2_STATE_DATA,
LZMA2_STATE_DATA_CONT,
LZMA2_STATE_FINISHED,
LZMA2_STATE_ERROR
} ELzma2State;
static SRes Lzma2Dec_GetOldProps(Byte prop, Byte *props)
{
UInt32 dicSize;
if (prop > 40)
return SZ_ERROR_UNSUPPORTED;
dicSize = (prop == 40) ? 0xFFFFFFFF : LZMA2_DIC_SIZE_FROM_PROP(prop);
props[0] = (Byte)LZMA2_LCLP_MAX;
props[1] = (Byte)(dicSize);
props[2] = (Byte)(dicSize >> 8);
props[3] = (Byte)(dicSize >> 16);
props[4] = (Byte)(dicSize >> 24);
return SZ_OK;
}
SRes Lzma2Dec_AllocateProbs(CLzma2Dec *p, Byte prop, ISzAllocPtr alloc)
{
Byte props[LZMA_PROPS_SIZE];
RINOK(Lzma2Dec_GetOldProps(prop, props))
return LzmaDec_AllocateProbs(&p->decoder, props, LZMA_PROPS_SIZE, alloc);
}
SRes Lzma2Dec_Allocate(CLzma2Dec *p, Byte prop, ISzAllocPtr alloc)
{
Byte props[LZMA_PROPS_SIZE];
RINOK(Lzma2Dec_GetOldProps(prop, props))
return LzmaDec_Allocate(&p->decoder, props, LZMA_PROPS_SIZE, alloc);
}
void Lzma2Dec_Init(CLzma2Dec *p)
{
p->state = LZMA2_STATE_CONTROL;
p->needInitLevel = 0xE0;
p->isExtraMode = False;
p->unpackSize = 0;
// p->decoder.dicPos = 0; // we can use it instead of full init
LzmaDec_Init(&p->decoder);
}
// ELzma2State
static unsigned Lzma2Dec_UpdateState(CLzma2Dec *p, Byte b)
{
switch (p->state)
{
case LZMA2_STATE_CONTROL:
p->isExtraMode = False;
p->control = b;
PRF(printf("\n %8X", (unsigned)p->decoder.dicPos));
PRF(printf(" %02X", (unsigned)b));
if (b == 0)
return LZMA2_STATE_FINISHED;
if (LZMA2_IS_UNCOMPRESSED_STATE(p))
{
if (b == LZMA2_CONTROL_COPY_RESET_DIC)
p->needInitLevel = 0xC0;
else if (b > 2 || p->needInitLevel == 0xE0)
return LZMA2_STATE_ERROR;
}
else
{
if (b < p->needInitLevel)
return LZMA2_STATE_ERROR;
p->needInitLevel = 0;
p->unpackSize = (UInt32)(b & 0x1F) << 16;
}
return LZMA2_STATE_UNPACK0;
case LZMA2_STATE_UNPACK0:
p->unpackSize |= (UInt32)b << 8;
return LZMA2_STATE_UNPACK1;
case LZMA2_STATE_UNPACK1:
p->unpackSize |= (UInt32)b;
p->unpackSize++;
PRF(printf(" %7u", (unsigned)p->unpackSize));
return LZMA2_IS_UNCOMPRESSED_STATE(p) ? LZMA2_STATE_DATA : LZMA2_STATE_PACK0;
case LZMA2_STATE_PACK0:
p->packSize = (UInt32)b << 8;
return LZMA2_STATE_PACK1;
case LZMA2_STATE_PACK1:
p->packSize |= (UInt32)b;
p->packSize++;
// if (p->packSize < 5) return LZMA2_STATE_ERROR;
PRF(printf(" %5u", (unsigned)p->packSize));
return (p->control & 0x40) ? LZMA2_STATE_PROP : LZMA2_STATE_DATA;
case LZMA2_STATE_PROP:
{
unsigned lc, lp;
if (b >= (9 * 5 * 5))
return LZMA2_STATE_ERROR;
lc = b % 9;
b /= 9;
p->decoder.prop.pb = (Byte)(b / 5);
lp = b % 5;
if (lc + lp > LZMA2_LCLP_MAX)
return LZMA2_STATE_ERROR;
p->decoder.prop.lc = (Byte)lc;
p->decoder.prop.lp = (Byte)lp;
return LZMA2_STATE_DATA;
}
}
return LZMA2_STATE_ERROR;
}
static void LzmaDec_UpdateWithUncompressed(CLzmaDec *p, const Byte *src, SizeT size)
{
memcpy(p->dic + p->dicPos, src, size);
p->dicPos += size;
if (p->checkDicSize == 0 && p->prop.dicSize - p->processedPos <= size)
p->checkDicSize = p->prop.dicSize;
p->processedPos += (UInt32)size;
}
void LzmaDec_InitDicAndState(CLzmaDec *p, BoolInt initDic, BoolInt initState);
SRes Lzma2Dec_DecodeToDic(CLzma2Dec *p, SizeT dicLimit,
const Byte *src, SizeT *srcLen, ELzmaFinishMode finishMode, ELzmaStatus *status)
{
SizeT inSize = *srcLen;
*srcLen = 0;
*status = LZMA_STATUS_NOT_SPECIFIED;
while (p->state != LZMA2_STATE_ERROR)
{
SizeT dicPos;
if (p->state == LZMA2_STATE_FINISHED)
{
*status = LZMA_STATUS_FINISHED_WITH_MARK;
return SZ_OK;
}
dicPos = p->decoder.dicPos;
if (dicPos == dicLimit && finishMode == LZMA_FINISH_ANY)
{
*status = LZMA_STATUS_NOT_FINISHED;
return SZ_OK;
}
if (p->state != LZMA2_STATE_DATA && p->state != LZMA2_STATE_DATA_CONT)
{
if (*srcLen == inSize)
{
*status = LZMA_STATUS_NEEDS_MORE_INPUT;
return SZ_OK;
}
(*srcLen)++;
p->state = Lzma2Dec_UpdateState(p, *src++);
if (dicPos == dicLimit && p->state != LZMA2_STATE_FINISHED)
break;
continue;
}
{
SizeT inCur = inSize - *srcLen;
SizeT outCur = dicLimit - dicPos;
ELzmaFinishMode curFinishMode = LZMA_FINISH_ANY;
if (outCur >= p->unpackSize)
{
outCur = (SizeT)p->unpackSize;
curFinishMode = LZMA_FINISH_END;
}
if (LZMA2_IS_UNCOMPRESSED_STATE(p))
{
if (inCur == 0)
{
*status = LZMA_STATUS_NEEDS_MORE_INPUT;
return SZ_OK;
}
if (p->state == LZMA2_STATE_DATA)
{
BoolInt initDic = (p->control == LZMA2_CONTROL_COPY_RESET_DIC);
LzmaDec_InitDicAndState(&p->decoder, initDic, False);
}
if (inCur > outCur)
inCur = outCur;
if (inCur == 0)
break;
LzmaDec_UpdateWithUncompressed(&p->decoder, src, inCur);
src += inCur;
*srcLen += inCur;
p->unpackSize -= (UInt32)inCur;
p->state = (p->unpackSize == 0) ? LZMA2_STATE_CONTROL : LZMA2_STATE_DATA_CONT;
}
else
{
SRes res;
if (p->state == LZMA2_STATE_DATA)
{
BoolInt initDic = (p->control >= 0xE0);
BoolInt initState = (p->control >= 0xA0);
LzmaDec_InitDicAndState(&p->decoder, initDic, initState);
p->state = LZMA2_STATE_DATA_CONT;
}
if (inCur > p->packSize)
inCur = (SizeT)p->packSize;
res = LzmaDec_DecodeToDic(&p->decoder, dicPos + outCur, src, &inCur, curFinishMode, status);
src += inCur;
*srcLen += inCur;
p->packSize -= (UInt32)inCur;
outCur = p->decoder.dicPos - dicPos;
p->unpackSize -= (UInt32)outCur;
if (res != 0)
break;
if (*status == LZMA_STATUS_NEEDS_MORE_INPUT)
{
if (p->packSize == 0)
break;
return SZ_OK;
}
if (inCur == 0 && outCur == 0)
{
if (*status != LZMA_STATUS_MAYBE_FINISHED_WITHOUT_MARK
|| p->unpackSize != 0
|| p->packSize != 0)
break;
p->state = LZMA2_STATE_CONTROL;
}
*status = LZMA_STATUS_NOT_SPECIFIED;
}
}
}
*status = LZMA_STATUS_NOT_SPECIFIED;
p->state = LZMA2_STATE_ERROR;
return SZ_ERROR_DATA;
}
ELzma2ParseStatus Lzma2Dec_Parse(CLzma2Dec *p,
SizeT outSize,
const Byte *src, SizeT *srcLen,
int checkFinishBlock)
{
SizeT inSize = *srcLen;
*srcLen = 0;
while (p->state != LZMA2_STATE_ERROR)
{
if (p->state == LZMA2_STATE_FINISHED)
return (ELzma2ParseStatus)LZMA_STATUS_FINISHED_WITH_MARK;
if (outSize == 0 && !checkFinishBlock)
return (ELzma2ParseStatus)LZMA_STATUS_NOT_FINISHED;
if (p->state != LZMA2_STATE_DATA && p->state != LZMA2_STATE_DATA_CONT)
{
if (*srcLen == inSize)
return (ELzma2ParseStatus)LZMA_STATUS_NEEDS_MORE_INPUT;
(*srcLen)++;
p->state = Lzma2Dec_UpdateState(p, *src++);
if (p->state == LZMA2_STATE_UNPACK0)
{
// if (p->decoder.dicPos != 0)
if (p->control == LZMA2_CONTROL_COPY_RESET_DIC || p->control >= 0xE0)
return LZMA2_PARSE_STATUS_NEW_BLOCK;
// if (outSize == 0) return LZMA_STATUS_NOT_FINISHED;
}
// The following code can be commented.
// It's not big problem, if we read additional input bytes.
// It will be stopped later in LZMA2_STATE_DATA / LZMA2_STATE_DATA_CONT state.
if (outSize == 0 && p->state != LZMA2_STATE_FINISHED)
{
// checkFinishBlock is true. So we expect that block must be finished,
// We can return LZMA_STATUS_NOT_SPECIFIED or LZMA_STATUS_NOT_FINISHED here
// break;
return (ELzma2ParseStatus)LZMA_STATUS_NOT_FINISHED;
}
if (p->state == LZMA2_STATE_DATA)
return LZMA2_PARSE_STATUS_NEW_CHUNK;
continue;
}
if (outSize == 0)
return (ELzma2ParseStatus)LZMA_STATUS_NOT_FINISHED;
{
SizeT inCur = inSize - *srcLen;
if (LZMA2_IS_UNCOMPRESSED_STATE(p))
{
if (inCur == 0)
return (ELzma2ParseStatus)LZMA_STATUS_NEEDS_MORE_INPUT;
if (inCur > p->unpackSize)
inCur = p->unpackSize;
if (inCur > outSize)
inCur = outSize;
p->decoder.dicPos += inCur;
src += inCur;
*srcLen += inCur;
outSize -= inCur;
p->unpackSize -= (UInt32)inCur;
p->state = (p->unpackSize == 0) ? LZMA2_STATE_CONTROL : LZMA2_STATE_DATA_CONT;
}
else
{
p->isExtraMode = True;
if (inCur == 0)
{
if (p->packSize != 0)
return (ELzma2ParseStatus)LZMA_STATUS_NEEDS_MORE_INPUT;
}
else if (p->state == LZMA2_STATE_DATA)
{
p->state = LZMA2_STATE_DATA_CONT;
if (*src != 0)
{
// first byte of lzma chunk must be Zero
*srcLen += 1;
p->packSize--;
break;
}
}
if (inCur > p->packSize)
inCur = (SizeT)p->packSize;
src += inCur;
*srcLen += inCur;
p->packSize -= (UInt32)inCur;
if (p->packSize == 0)
{
SizeT rem = outSize;
if (rem > p->unpackSize)
rem = p->unpackSize;
p->decoder.dicPos += rem;
p->unpackSize -= (UInt32)rem;
outSize -= rem;
if (p->unpackSize == 0)
p->state = LZMA2_STATE_CONTROL;
}
}
}
}
p->state = LZMA2_STATE_ERROR;
return (ELzma2ParseStatus)LZMA_STATUS_NOT_SPECIFIED;
}
SRes Lzma2Dec_DecodeToBuf(CLzma2Dec *p, Byte *dest, SizeT *destLen, const Byte *src, SizeT *srcLen, ELzmaFinishMode finishMode, ELzmaStatus *status)
{
SizeT outSize = *destLen, inSize = *srcLen;
*srcLen = *destLen = 0;
for (;;)
{
SizeT inCur = inSize, outCur, dicPos;
ELzmaFinishMode curFinishMode;
SRes res;
if (p->decoder.dicPos == p->decoder.dicBufSize)
p->decoder.dicPos = 0;
dicPos = p->decoder.dicPos;
curFinishMode = LZMA_FINISH_ANY;
outCur = p->decoder.dicBufSize - dicPos;
if (outCur >= outSize)
{
outCur = outSize;
curFinishMode = finishMode;
}
res = Lzma2Dec_DecodeToDic(p, dicPos + outCur, src, &inCur, curFinishMode, status);
src += inCur;
inSize -= inCur;
*srcLen += inCur;
outCur = p->decoder.dicPos - dicPos;
memcpy(dest, p->decoder.dic + dicPos, outCur);
dest += outCur;
outSize -= outCur;
*destLen += outCur;
if (res != 0)
return res;
if (outCur == 0 || outSize == 0)
return SZ_OK;
}
}
SRes Lzma2Decode(Byte *dest, SizeT *destLen, const Byte *src, SizeT *srcLen,
Byte prop, ELzmaFinishMode finishMode, ELzmaStatus *status, ISzAllocPtr alloc)
{
CLzma2Dec p;
SRes res;
SizeT outSize = *destLen, inSize = *srcLen;
*destLen = *srcLen = 0;
*status = LZMA_STATUS_NOT_SPECIFIED;
Lzma2Dec_CONSTRUCT(&p)
RINOK(Lzma2Dec_AllocateProbs(&p, prop, alloc))
p.decoder.dic = dest;
p.decoder.dicBufSize = outSize;
Lzma2Dec_Init(&p);
*srcLen = inSize;
res = Lzma2Dec_DecodeToDic(&p, outSize, src, srcLen, finishMode, status);
*destLen = p.decoder.dicPos;
if (res == SZ_OK && *status == LZMA_STATUS_NEEDS_MORE_INPUT)
res = SZ_ERROR_INPUT_EOF;
Lzma2Dec_FreeProbs(&p, alloc);
return res;
}
#undef PRF

121
C/Lzma2Dec.h Executable file
View File

@@ -0,0 +1,121 @@
/* Lzma2Dec.h -- LZMA2 Decoder
2023-03-03 : Igor Pavlov : Public domain */
#ifndef ZIP7_INC_LZMA2_DEC_H
#define ZIP7_INC_LZMA2_DEC_H
#include "LzmaDec.h"
EXTERN_C_BEGIN
/* ---------- State Interface ---------- */
typedef struct
{
unsigned state;
Byte control;
Byte needInitLevel;
Byte isExtraMode;
Byte _pad_;
UInt32 packSize;
UInt32 unpackSize;
CLzmaDec decoder;
} CLzma2Dec;
#define Lzma2Dec_CONSTRUCT(p) LzmaDec_CONSTRUCT(&(p)->decoder)
#define Lzma2Dec_Construct(p) Lzma2Dec_CONSTRUCT(p)
#define Lzma2Dec_FreeProbs(p, alloc) LzmaDec_FreeProbs(&(p)->decoder, alloc)
#define Lzma2Dec_Free(p, alloc) LzmaDec_Free(&(p)->decoder, alloc)
SRes Lzma2Dec_AllocateProbs(CLzma2Dec *p, Byte prop, ISzAllocPtr alloc);
SRes Lzma2Dec_Allocate(CLzma2Dec *p, Byte prop, ISzAllocPtr alloc);
void Lzma2Dec_Init(CLzma2Dec *p);
/*
finishMode:
It has meaning only if the decoding reaches output limit (*destLen or dicLimit).
LZMA_FINISH_ANY - use smallest number of input bytes
LZMA_FINISH_END - read EndOfStream marker after decoding
Returns:
SZ_OK
status:
LZMA_STATUS_FINISHED_WITH_MARK
LZMA_STATUS_NOT_FINISHED
LZMA_STATUS_NEEDS_MORE_INPUT
SZ_ERROR_DATA - Data error
*/
SRes Lzma2Dec_DecodeToDic(CLzma2Dec *p, SizeT dicLimit,
const Byte *src, SizeT *srcLen, ELzmaFinishMode finishMode, ELzmaStatus *status);
SRes Lzma2Dec_DecodeToBuf(CLzma2Dec *p, Byte *dest, SizeT *destLen,
const Byte *src, SizeT *srcLen, ELzmaFinishMode finishMode, ELzmaStatus *status);
/* ---------- LZMA2 block and chunk parsing ---------- */
/*
Lzma2Dec_Parse() parses compressed data stream up to next independent block or next chunk data.
It can return LZMA_STATUS_* code or LZMA2_PARSE_STATUS_* code:
- LZMA2_PARSE_STATUS_NEW_BLOCK - there is new block, and 1 additional byte (control byte of next block header) was read from input.
- LZMA2_PARSE_STATUS_NEW_CHUNK - there is new chunk, and only lzma2 header of new chunk was read.
CLzma2Dec::unpackSize contains unpack size of that chunk
*/
typedef enum
{
/*
LZMA_STATUS_NOT_SPECIFIED // data error
LZMA_STATUS_FINISHED_WITH_MARK
LZMA_STATUS_NOT_FINISHED //
LZMA_STATUS_NEEDS_MORE_INPUT
LZMA_STATUS_MAYBE_FINISHED_WITHOUT_MARK // unused
*/
LZMA2_PARSE_STATUS_NEW_BLOCK = LZMA_STATUS_MAYBE_FINISHED_WITHOUT_MARK + 1,
LZMA2_PARSE_STATUS_NEW_CHUNK
} ELzma2ParseStatus;
ELzma2ParseStatus Lzma2Dec_Parse(CLzma2Dec *p,
SizeT outSize, // output size
const Byte *src, SizeT *srcLen,
int checkFinishBlock // set (checkFinishBlock = 1), if it must read full input data, if decoder.dicPos reaches blockMax position.
);
/*
LZMA2 parser doesn't decode LZMA chunks, so we must read
full input LZMA chunk to decode some part of LZMA chunk.
Lzma2Dec_GetUnpackExtra() returns the value that shows
max possible number of output bytes that can be output by decoder
at current input positon.
*/
#define Lzma2Dec_GetUnpackExtra(p) ((p)->isExtraMode ? (p)->unpackSize : 0)
/* ---------- One Call Interface ---------- */
/*
finishMode:
It has meaning only if the decoding reaches output limit (*destLen).
LZMA_FINISH_ANY - use smallest number of input bytes
LZMA_FINISH_END - read EndOfStream marker after decoding
Returns:
SZ_OK
status:
LZMA_STATUS_FINISHED_WITH_MARK
LZMA_STATUS_NOT_FINISHED
SZ_ERROR_DATA - Data error
SZ_ERROR_MEM - Memory allocation error
SZ_ERROR_UNSUPPORTED - Unsupported properties
SZ_ERROR_INPUT_EOF - It needs more bytes in input buffer (src).
*/
SRes Lzma2Decode(Byte *dest, SizeT *destLen, const Byte *src, SizeT *srcLen,
Byte prop, ELzmaFinishMode finishMode, ELzmaStatus *status, ISzAllocPtr alloc);
EXTERN_C_END
#endif

1095
C/Lzma2DecMt.c Executable file
View File

File diff suppressed because it is too large Load Diff

81
C/Lzma2DecMt.h Executable file
View File

@@ -0,0 +1,81 @@
/* Lzma2DecMt.h -- LZMA2 Decoder Multi-thread
2023-04-13 : Igor Pavlov : Public domain */
#ifndef ZIP7_INC_LZMA2_DEC_MT_H
#define ZIP7_INC_LZMA2_DEC_MT_H
#include "7zTypes.h"
EXTERN_C_BEGIN
typedef struct
{
size_t inBufSize_ST;
size_t outStep_ST;
#ifndef Z7_ST
unsigned numThreads;
size_t inBufSize_MT;
size_t outBlockMax;
size_t inBlockMax;
#endif
} CLzma2DecMtProps;
/* init to single-thread mode */
void Lzma2DecMtProps_Init(CLzma2DecMtProps *p);
/* ---------- CLzma2DecMtHandle Interface ---------- */
/* Lzma2DecMt_ * functions can return the following exit codes:
SRes:
SZ_OK - OK
SZ_ERROR_MEM - Memory allocation error
SZ_ERROR_PARAM - Incorrect paramater in props
SZ_ERROR_WRITE - ISeqOutStream write callback error
// SZ_ERROR_OUTPUT_EOF - output buffer overflow - version with (Byte *) output
SZ_ERROR_PROGRESS - some break from progress callback
SZ_ERROR_THREAD - error in multithreading functions (only for Mt version)
*/
typedef struct CLzma2DecMt CLzma2DecMt;
typedef CLzma2DecMt * CLzma2DecMtHandle;
// Z7_DECLARE_HANDLE(CLzma2DecMtHandle)
CLzma2DecMtHandle Lzma2DecMt_Create(ISzAllocPtr alloc, ISzAllocPtr allocMid);
void Lzma2DecMt_Destroy(CLzma2DecMtHandle p);
SRes Lzma2DecMt_Decode(CLzma2DecMtHandle p,
Byte prop,
const CLzma2DecMtProps *props,
ISeqOutStreamPtr outStream,
const UInt64 *outDataSize, // NULL means undefined
int finishMode, // 0 - partial unpacking is allowed, 1 - if lzma2 stream must be finished
// Byte *outBuf, size_t *outBufSize,
ISeqInStreamPtr inStream,
// const Byte *inData, size_t inDataSize,
// out variables:
UInt64 *inProcessed,
int *isMT, /* out: (*isMT == 0), if single thread decoding was used */
// UInt64 *outProcessed,
ICompressProgressPtr progress);
/* ---------- Read from CLzma2DecMtHandle Interface ---------- */
SRes Lzma2DecMt_Init(CLzma2DecMtHandle pp,
Byte prop,
const CLzma2DecMtProps *props,
const UInt64 *outDataSize, int finishMode,
ISeqInStreamPtr inStream);
SRes Lzma2DecMt_Read(CLzma2DecMtHandle pp,
Byte *data, size_t *outSize,
UInt64 *inStreamProcessed);
EXTERN_C_END
#endif

805
C/Lzma2Enc.c Executable file
View File

@@ -0,0 +1,805 @@
/* Lzma2Enc.c -- LZMA2 Encoder
2023-04-13 : Igor Pavlov : Public domain */
#include "Precomp.h"
#include <string.h>
/* #define Z7_ST */
#include "Lzma2Enc.h"
#ifndef Z7_ST
#include "MtCoder.h"
#else
#define MTCODER_THREADS_MAX 1
#endif
#define LZMA2_CONTROL_LZMA (1 << 7)
#define LZMA2_CONTROL_COPY_NO_RESET 2
#define LZMA2_CONTROL_COPY_RESET_DIC 1
#define LZMA2_CONTROL_EOF 0
#define LZMA2_LCLP_MAX 4
#define LZMA2_DIC_SIZE_FROM_PROP(p) (((UInt32)2 | ((p) & 1)) << ((p) / 2 + 11))
#define LZMA2_PACK_SIZE_MAX (1 << 16)
#define LZMA2_COPY_CHUNK_SIZE LZMA2_PACK_SIZE_MAX
#define LZMA2_UNPACK_SIZE_MAX (1 << 21)
#define LZMA2_KEEP_WINDOW_SIZE LZMA2_UNPACK_SIZE_MAX
#define LZMA2_CHUNK_SIZE_COMPRESSED_MAX ((1 << 16) + 16)
#define PRF(x) /* x */
/* ---------- CLimitedSeqInStream ---------- */
typedef struct
{
ISeqInStream vt;
ISeqInStreamPtr realStream;
UInt64 limit;
UInt64 processed;
int finished;
} CLimitedSeqInStream;
static void LimitedSeqInStream_Init(CLimitedSeqInStream *p)
{
p->limit = (UInt64)(Int64)-1;
p->processed = 0;
p->finished = 0;
}
static SRes LimitedSeqInStream_Read(ISeqInStreamPtr pp, void *data, size_t *size)
{
Z7_CONTAINER_FROM_VTBL_TO_DECL_VAR_pp_vt_p(CLimitedSeqInStream)
size_t size2 = *size;
SRes res = SZ_OK;
if (p->limit != (UInt64)(Int64)-1)
{
const UInt64 rem = p->limit - p->processed;
if (size2 > rem)
size2 = (size_t)rem;
}
if (size2 != 0)
{
res = ISeqInStream_Read(p->realStream, data, &size2);
p->finished = (size2 == 0 ? 1 : 0);
p->processed += size2;
}
*size = size2;
return res;
}
/* ---------- CLzma2EncInt ---------- */
typedef struct
{
CLzmaEncHandle enc;
Byte propsAreSet;
Byte propsByte;
Byte needInitState;
Byte needInitProp;
UInt64 srcPos;
} CLzma2EncInt;
static SRes Lzma2EncInt_InitStream(CLzma2EncInt *p, const CLzma2EncProps *props)
{
if (!p->propsAreSet)
{
SizeT propsSize = LZMA_PROPS_SIZE;
Byte propsEncoded[LZMA_PROPS_SIZE];
RINOK(LzmaEnc_SetProps(p->enc, &props->lzmaProps))
RINOK(LzmaEnc_WriteProperties(p->enc, propsEncoded, &propsSize))
p->propsByte = propsEncoded[0];
p->propsAreSet = True;
}
return SZ_OK;
}
static void Lzma2EncInt_InitBlock(CLzma2EncInt *p)
{
p->srcPos = 0;
p->needInitState = True;
p->needInitProp = True;
}
SRes LzmaEnc_PrepareForLzma2(CLzmaEncHandle p, ISeqInStreamPtr inStream, UInt32 keepWindowSize,
ISzAllocPtr alloc, ISzAllocPtr allocBig);
SRes LzmaEnc_MemPrepare(CLzmaEncHandle p, const Byte *src, SizeT srcLen,
UInt32 keepWindowSize, ISzAllocPtr alloc, ISzAllocPtr allocBig);
SRes LzmaEnc_CodeOneMemBlock(CLzmaEncHandle p, BoolInt reInit,
Byte *dest, size_t *destLen, UInt32 desiredPackSize, UInt32 *unpackSize);
const Byte *LzmaEnc_GetCurBuf(CLzmaEncHandle p);
void LzmaEnc_Finish(CLzmaEncHandle p);
void LzmaEnc_SaveState(CLzmaEncHandle p);
void LzmaEnc_RestoreState(CLzmaEncHandle p);
/*
UInt32 LzmaEnc_GetNumAvailableBytes(CLzmaEncHandle p);
*/
static SRes Lzma2EncInt_EncodeSubblock(CLzma2EncInt *p, Byte *outBuf,
size_t *packSizeRes, ISeqOutStreamPtr outStream)
{
size_t packSizeLimit = *packSizeRes;
size_t packSize = packSizeLimit;
UInt32 unpackSize = LZMA2_UNPACK_SIZE_MAX;
unsigned lzHeaderSize = 5 + (p->needInitProp ? 1 : 0);
BoolInt useCopyBlock;
SRes res;
*packSizeRes = 0;
if (packSize < lzHeaderSize)
return SZ_ERROR_OUTPUT_EOF;
packSize -= lzHeaderSize;
LzmaEnc_SaveState(p->enc);
res = LzmaEnc_CodeOneMemBlock(p->enc, p->needInitState,
outBuf + lzHeaderSize, &packSize, LZMA2_PACK_SIZE_MAX, &unpackSize);
PRF(printf("\npackSize = %7d unpackSize = %7d ", packSize, unpackSize));
if (unpackSize == 0)
return res;
if (res == SZ_OK)
useCopyBlock = (packSize + 2 >= unpackSize || packSize > (1 << 16));
else
{
if (res != SZ_ERROR_OUTPUT_EOF)
return res;
res = SZ_OK;
useCopyBlock = True;
}
if (useCopyBlock)
{
size_t destPos = 0;
PRF(printf("################# COPY "));
while (unpackSize > 0)
{
const UInt32 u = (unpackSize < LZMA2_COPY_CHUNK_SIZE) ? unpackSize : LZMA2_COPY_CHUNK_SIZE;
if (packSizeLimit - destPos < u + 3)
return SZ_ERROR_OUTPUT_EOF;
outBuf[destPos++] = (Byte)(p->srcPos == 0 ? LZMA2_CONTROL_COPY_RESET_DIC : LZMA2_CONTROL_COPY_NO_RESET);
outBuf[destPos++] = (Byte)((u - 1) >> 8);
outBuf[destPos++] = (Byte)(u - 1);
memcpy(outBuf + destPos, LzmaEnc_GetCurBuf(p->enc) - unpackSize, u);
unpackSize -= u;
destPos += u;
p->srcPos += u;
if (outStream)
{
*packSizeRes += destPos;
if (ISeqOutStream_Write(outStream, outBuf, destPos) != destPos)
return SZ_ERROR_WRITE;
destPos = 0;
}
else
*packSizeRes = destPos;
/* needInitState = True; */
}
LzmaEnc_RestoreState(p->enc);
return SZ_OK;
}
{
size_t destPos = 0;
const UInt32 u = unpackSize - 1;
const UInt32 pm = (UInt32)(packSize - 1);
const unsigned mode = (p->srcPos == 0) ? 3 : (p->needInitState ? (p->needInitProp ? 2 : 1) : 0);
PRF(printf(" "));
outBuf[destPos++] = (Byte)(LZMA2_CONTROL_LZMA | (mode << 5) | ((u >> 16) & 0x1F));
outBuf[destPos++] = (Byte)(u >> 8);
outBuf[destPos++] = (Byte)u;
outBuf[destPos++] = (Byte)(pm >> 8);
outBuf[destPos++] = (Byte)pm;
if (p->needInitProp)
outBuf[destPos++] = p->propsByte;
p->needInitProp = False;
p->needInitState = False;
destPos += packSize;
p->srcPos += unpackSize;
if (outStream)
if (ISeqOutStream_Write(outStream, outBuf, destPos) != destPos)
return SZ_ERROR_WRITE;
*packSizeRes = destPos;
return SZ_OK;
}
}
/* ---------- Lzma2 Props ---------- */
void Lzma2EncProps_Init(CLzma2EncProps *p)
{
LzmaEncProps_Init(&p->lzmaProps);
p->blockSize = LZMA2_ENC_PROPS_BLOCK_SIZE_AUTO;
p->numBlockThreads_Reduced = -1;
p->numBlockThreads_Max = -1;
p->numTotalThreads = -1;
}
void Lzma2EncProps_Normalize(CLzma2EncProps *p)
{
UInt64 fileSize;
int t1, t1n, t2, t2r, t3;
{
CLzmaEncProps lzmaProps = p->lzmaProps;
LzmaEncProps_Normalize(&lzmaProps);
t1n = lzmaProps.numThreads;
}
t1 = p->lzmaProps.numThreads;
t2 = p->numBlockThreads_Max;
t3 = p->numTotalThreads;
if (t2 > MTCODER_THREADS_MAX)
t2 = MTCODER_THREADS_MAX;
if (t3 <= 0)
{
if (t2 <= 0)
t2 = 1;
t3 = t1n * t2;
}
else if (t2 <= 0)
{
t2 = t3 / t1n;
if (t2 == 0)
{
t1 = 1;
t2 = t3;
}
if (t2 > MTCODER_THREADS_MAX)
t2 = MTCODER_THREADS_MAX;
}
else if (t1 <= 0)
{
t1 = t3 / t2;
if (t1 == 0)
t1 = 1;
}
else
t3 = t1n * t2;
p->lzmaProps.numThreads = t1;
t2r = t2;
fileSize = p->lzmaProps.reduceSize;
if ( p->blockSize != LZMA2_ENC_PROPS_BLOCK_SIZE_SOLID
&& p->blockSize != LZMA2_ENC_PROPS_BLOCK_SIZE_AUTO
&& (p->blockSize < fileSize || fileSize == (UInt64)(Int64)-1))
p->lzmaProps.reduceSize = p->blockSize;
LzmaEncProps_Normalize(&p->lzmaProps);
p->lzmaProps.reduceSize = fileSize;
t1 = p->lzmaProps.numThreads;
if (p->blockSize == LZMA2_ENC_PROPS_BLOCK_SIZE_SOLID)
{
t2r = t2 = 1;
t3 = t1;
}
else if (p->blockSize == LZMA2_ENC_PROPS_BLOCK_SIZE_AUTO && t2 <= 1)
{
/* if there is no block multi-threading, we use SOLID block */
p->blockSize = LZMA2_ENC_PROPS_BLOCK_SIZE_SOLID;
}
else
{
if (p->blockSize == LZMA2_ENC_PROPS_BLOCK_SIZE_AUTO)
{
const UInt32 kMinSize = (UInt32)1 << 20;
const UInt32 kMaxSize = (UInt32)1 << 28;
const UInt32 dictSize = p->lzmaProps.dictSize;
UInt64 blockSize = (UInt64)dictSize << 2;
if (blockSize < kMinSize) blockSize = kMinSize;
if (blockSize > kMaxSize) blockSize = kMaxSize;
if (blockSize < dictSize) blockSize = dictSize;
blockSize += (kMinSize - 1);
blockSize &= ~(UInt64)(kMinSize - 1);
p->blockSize = blockSize;
}
if (t2 > 1 && fileSize != (UInt64)(Int64)-1)
{
UInt64 numBlocks = fileSize / p->blockSize;
if (numBlocks * p->blockSize != fileSize)
numBlocks++;
if (numBlocks < (unsigned)t2)
{
t2r = (int)numBlocks;
if (t2r == 0)
t2r = 1;
t3 = t1 * t2r;
}
}
}
p->numBlockThreads_Max = t2;
p->numBlockThreads_Reduced = t2r;
p->numTotalThreads = t3;
}
static SRes Progress(ICompressProgressPtr p, UInt64 inSize, UInt64 outSize)
{
return (p && ICompressProgress_Progress(p, inSize, outSize) != SZ_OK) ? SZ_ERROR_PROGRESS : SZ_OK;
}
/* ---------- Lzma2 ---------- */
struct CLzma2Enc
{
Byte propEncoded;
CLzma2EncProps props;
UInt64 expectedDataSize;
Byte *tempBufLzma;
ISzAllocPtr alloc;
ISzAllocPtr allocBig;
CLzma2EncInt coders[MTCODER_THREADS_MAX];
#ifndef Z7_ST
ISeqOutStreamPtr outStream;
Byte *outBuf;
size_t outBuf_Rem; /* remainder in outBuf */
size_t outBufSize; /* size of allocated outBufs[i] */
size_t outBufsDataSizes[MTCODER_BLOCKS_MAX];
BoolInt mtCoder_WasConstructed;
CMtCoder mtCoder;
Byte *outBufs[MTCODER_BLOCKS_MAX];
#endif
};
CLzma2EncHandle Lzma2Enc_Create(ISzAllocPtr alloc, ISzAllocPtr allocBig)
{
CLzma2Enc *p = (CLzma2Enc *)ISzAlloc_Alloc(alloc, sizeof(CLzma2Enc));
if (!p)
return NULL;
Lzma2EncProps_Init(&p->props);
Lzma2EncProps_Normalize(&p->props);
p->expectedDataSize = (UInt64)(Int64)-1;
p->tempBufLzma = NULL;
p->alloc = alloc;
p->allocBig = allocBig;
{
unsigned i;
for (i = 0; i < MTCODER_THREADS_MAX; i++)
p->coders[i].enc = NULL;
}
#ifndef Z7_ST
p->mtCoder_WasConstructed = False;
{
unsigned i;
for (i = 0; i < MTCODER_BLOCKS_MAX; i++)
p->outBufs[i] = NULL;
p->outBufSize = 0;
}
#endif
return (CLzma2EncHandle)p;
}
#ifndef Z7_ST
static void Lzma2Enc_FreeOutBufs(CLzma2Enc *p)
{
unsigned i;
for (i = 0; i < MTCODER_BLOCKS_MAX; i++)
if (p->outBufs[i])
{
ISzAlloc_Free(p->alloc, p->outBufs[i]);
p->outBufs[i] = NULL;
}
p->outBufSize = 0;
}
#endif
// #define GET_CLzma2Enc_p CLzma2Enc *p = (CLzma2Enc *)(void *)p;
void Lzma2Enc_Destroy(CLzma2EncHandle p)
{
// GET_CLzma2Enc_p
unsigned i;
for (i = 0; i < MTCODER_THREADS_MAX; i++)
{
CLzma2EncInt *t = &p->coders[i];
if (t->enc)
{
LzmaEnc_Destroy(t->enc, p->alloc, p->allocBig);
t->enc = NULL;
}
}
#ifndef Z7_ST
if (p->mtCoder_WasConstructed)
{
MtCoder_Destruct(&p->mtCoder);
p->mtCoder_WasConstructed = False;
}
Lzma2Enc_FreeOutBufs(p);
#endif
ISzAlloc_Free(p->alloc, p->tempBufLzma);
p->tempBufLzma = NULL;
ISzAlloc_Free(p->alloc, p);
}
SRes Lzma2Enc_SetProps(CLzma2EncHandle p, const CLzma2EncProps *props)
{
// GET_CLzma2Enc_p
CLzmaEncProps lzmaProps = props->lzmaProps;
LzmaEncProps_Normalize(&lzmaProps);
if (lzmaProps.lc + lzmaProps.lp > LZMA2_LCLP_MAX)
return SZ_ERROR_PARAM;
p->props = *props;
Lzma2EncProps_Normalize(&p->props);
return SZ_OK;
}
void Lzma2Enc_SetDataSize(CLzma2EncHandle p, UInt64 expectedDataSiize)
{
// GET_CLzma2Enc_p
p->expectedDataSize = expectedDataSiize;
}
Byte Lzma2Enc_WriteProperties(CLzma2EncHandle p)
{
// GET_CLzma2Enc_p
unsigned i;
UInt32 dicSize = LzmaEncProps_GetDictSize(&p->props.lzmaProps);
for (i = 0; i < 40; i++)
if (dicSize <= LZMA2_DIC_SIZE_FROM_PROP(i))
break;
return (Byte)i;
}
static SRes Lzma2Enc_EncodeMt1(
CLzma2Enc *me,
CLzma2EncInt *p,
ISeqOutStreamPtr outStream,
Byte *outBuf, size_t *outBufSize,
ISeqInStreamPtr inStream,
const Byte *inData, size_t inDataSize,
int finished,
ICompressProgressPtr progress)
{
UInt64 unpackTotal = 0;
UInt64 packTotal = 0;
size_t outLim = 0;
CLimitedSeqInStream limitedInStream;
if (outBuf)
{
outLim = *outBufSize;
*outBufSize = 0;
}
if (!p->enc)
{
p->propsAreSet = False;
p->enc = LzmaEnc_Create(me->alloc);
if (!p->enc)
return SZ_ERROR_MEM;
}
limitedInStream.realStream = inStream;
if (inStream)
{
limitedInStream.vt.Read = LimitedSeqInStream_Read;
}
if (!outBuf)
{
// outStream version works only in one thread. So we use CLzma2Enc::tempBufLzma
if (!me->tempBufLzma)
{
me->tempBufLzma = (Byte *)ISzAlloc_Alloc(me->alloc, LZMA2_CHUNK_SIZE_COMPRESSED_MAX);
if (!me->tempBufLzma)
return SZ_ERROR_MEM;
}
}
RINOK(Lzma2EncInt_InitStream(p, &me->props))
for (;;)
{
SRes res = SZ_OK;
SizeT inSizeCur = 0;
Lzma2EncInt_InitBlock(p);
LimitedSeqInStream_Init(&limitedInStream);
limitedInStream.limit = me->props.blockSize;
if (inStream)
{
UInt64 expected = (UInt64)(Int64)-1;
// inStream version works only in one thread. So we use CLzma2Enc::expectedDataSize
if (me->expectedDataSize != (UInt64)(Int64)-1
&& me->expectedDataSize >= unpackTotal)
expected = me->expectedDataSize - unpackTotal;
if (me->props.blockSize != LZMA2_ENC_PROPS_BLOCK_SIZE_SOLID
&& expected > me->props.blockSize)
expected = (size_t)me->props.blockSize;
LzmaEnc_SetDataSize(p->enc, expected);
RINOK(LzmaEnc_PrepareForLzma2(p->enc,
&limitedInStream.vt,
LZMA2_KEEP_WINDOW_SIZE,
me->alloc,
me->allocBig))
}
else
{
inSizeCur = (SizeT)(inDataSize - (size_t)unpackTotal);
if (me->props.blockSize != LZMA2_ENC_PROPS_BLOCK_SIZE_SOLID
&& inSizeCur > me->props.blockSize)
inSizeCur = (SizeT)(size_t)me->props.blockSize;
// LzmaEnc_SetDataSize(p->enc, inSizeCur);
RINOK(LzmaEnc_MemPrepare(p->enc,
inData + (size_t)unpackTotal, inSizeCur,
LZMA2_KEEP_WINDOW_SIZE,
me->alloc,
me->allocBig))
}
for (;;)
{
size_t packSize = LZMA2_CHUNK_SIZE_COMPRESSED_MAX;
if (outBuf)
packSize = outLim - (size_t)packTotal;
res = Lzma2EncInt_EncodeSubblock(p,
outBuf ? outBuf + (size_t)packTotal : me->tempBufLzma, &packSize,
outBuf ? NULL : outStream);
if (res != SZ_OK)
break;
packTotal += packSize;
if (outBuf)
*outBufSize = (size_t)packTotal;
res = Progress(progress, unpackTotal + p->srcPos, packTotal);
if (res != SZ_OK)
break;
/*
if (LzmaEnc_GetNumAvailableBytes(p->enc) == 0)
break;
*/
if (packSize == 0)
break;
}
LzmaEnc_Finish(p->enc);
unpackTotal += p->srcPos;
RINOK(res)
if (p->srcPos != (inStream ? limitedInStream.processed : inSizeCur))
return SZ_ERROR_FAIL;
if (inStream ? limitedInStream.finished : (unpackTotal == inDataSize))
{
if (finished)
{
if (outBuf)
{
const size_t destPos = *outBufSize;
if (destPos >= outLim)
return SZ_ERROR_OUTPUT_EOF;
outBuf[destPos] = LZMA2_CONTROL_EOF; // 0
*outBufSize = destPos + 1;
}
else
{
const Byte b = LZMA2_CONTROL_EOF; // 0;
if (ISeqOutStream_Write(outStream, &b, 1) != 1)
return SZ_ERROR_WRITE;
}
}
return SZ_OK;
}
}
}
#ifndef Z7_ST
static SRes Lzma2Enc_MtCallback_Code(void *p, unsigned coderIndex, unsigned outBufIndex,
const Byte *src, size_t srcSize, int finished)
{
CLzma2Enc *me = (CLzma2Enc *)p;
size_t destSize = me->outBufSize;
SRes res;
CMtProgressThunk progressThunk;
Byte *dest = me->outBufs[outBufIndex];
me->outBufsDataSizes[outBufIndex] = 0;
if (!dest)
{
dest = (Byte *)ISzAlloc_Alloc(me->alloc, me->outBufSize);
if (!dest)
return SZ_ERROR_MEM;
me->outBufs[outBufIndex] = dest;
}
MtProgressThunk_CreateVTable(&progressThunk);
progressThunk.mtProgress = &me->mtCoder.mtProgress;
progressThunk.inSize = 0;
progressThunk.outSize = 0;
res = Lzma2Enc_EncodeMt1(me,
&me->coders[coderIndex],
NULL, dest, &destSize,
NULL, src, srcSize,
finished,
&progressThunk.vt);
me->outBufsDataSizes[outBufIndex] = destSize;
return res;
}
static SRes Lzma2Enc_MtCallback_Write(void *p, unsigned outBufIndex)
{
CLzma2Enc *me = (CLzma2Enc *)p;
size_t size = me->outBufsDataSizes[outBufIndex];
const Byte *data = me->outBufs[outBufIndex];
if (me->outStream)
return ISeqOutStream_Write(me->outStream, data, size) == size ? SZ_OK : SZ_ERROR_WRITE;
if (size > me->outBuf_Rem)
return SZ_ERROR_OUTPUT_EOF;
memcpy(me->outBuf, data, size);
me->outBuf_Rem -= size;
me->outBuf += size;
return SZ_OK;
}
#endif
SRes Lzma2Enc_Encode2(CLzma2EncHandle p,
ISeqOutStreamPtr outStream,
Byte *outBuf, size_t *outBufSize,
ISeqInStreamPtr inStream,
const Byte *inData, size_t inDataSize,
ICompressProgressPtr progress)
{
// GET_CLzma2Enc_p
if (inStream && inData)
return SZ_ERROR_PARAM;
if (outStream && outBuf)
return SZ_ERROR_PARAM;
{
unsigned i;
for (i = 0; i < MTCODER_THREADS_MAX; i++)
p->coders[i].propsAreSet = False;
}
#ifndef Z7_ST
if (p->props.numBlockThreads_Reduced > 1)
{
IMtCoderCallback2 vt;
if (!p->mtCoder_WasConstructed)
{
p->mtCoder_WasConstructed = True;
MtCoder_Construct(&p->mtCoder);
}
vt.Code = Lzma2Enc_MtCallback_Code;
vt.Write = Lzma2Enc_MtCallback_Write;
p->outStream = outStream;
p->outBuf = NULL;
p->outBuf_Rem = 0;
if (!outStream)
{
p->outBuf = outBuf;
p->outBuf_Rem = *outBufSize;
*outBufSize = 0;
}
p->mtCoder.allocBig = p->allocBig;
p->mtCoder.progress = progress;
p->mtCoder.inStream = inStream;
p->mtCoder.inData = inData;
p->mtCoder.inDataSize = inDataSize;
p->mtCoder.mtCallback = &vt;
p->mtCoder.mtCallbackObject = p;
p->mtCoder.blockSize = (size_t)p->props.blockSize;
if (p->mtCoder.blockSize != p->props.blockSize)
return SZ_ERROR_PARAM; /* SZ_ERROR_MEM */
{
const size_t destBlockSize = p->mtCoder.blockSize + (p->mtCoder.blockSize >> 10) + 16;
if (destBlockSize < p->mtCoder.blockSize)
return SZ_ERROR_PARAM;
if (p->outBufSize != destBlockSize)
Lzma2Enc_FreeOutBufs(p);
p->outBufSize = destBlockSize;
}
p->mtCoder.numThreadsMax = (unsigned)p->props.numBlockThreads_Max;
p->mtCoder.expectedDataSize = p->expectedDataSize;
{
const SRes res = MtCoder_Code(&p->mtCoder);
if (!outStream)
*outBufSize = (size_t)(p->outBuf - outBuf);
return res;
}
}
#endif
return Lzma2Enc_EncodeMt1(p,
&p->coders[0],
outStream, outBuf, outBufSize,
inStream, inData, inDataSize,
True, /* finished */
progress);
}
#undef PRF

57
C/Lzma2Enc.h Executable file
View File

@@ -0,0 +1,57 @@
/* Lzma2Enc.h -- LZMA2 Encoder
2023-04-13 : Igor Pavlov : Public domain */
#ifndef ZIP7_INC_LZMA2_ENC_H
#define ZIP7_INC_LZMA2_ENC_H
#include "LzmaEnc.h"
EXTERN_C_BEGIN
#define LZMA2_ENC_PROPS_BLOCK_SIZE_AUTO 0
#define LZMA2_ENC_PROPS_BLOCK_SIZE_SOLID ((UInt64)(Int64)-1)
typedef struct
{
CLzmaEncProps lzmaProps;
UInt64 blockSize;
int numBlockThreads_Reduced;
int numBlockThreads_Max;
int numTotalThreads;
} CLzma2EncProps;
void Lzma2EncProps_Init(CLzma2EncProps *p);
void Lzma2EncProps_Normalize(CLzma2EncProps *p);
/* ---------- CLzmaEnc2Handle Interface ---------- */
/* Lzma2Enc_* functions can return the following exit codes:
SRes:
SZ_OK - OK
SZ_ERROR_MEM - Memory allocation error
SZ_ERROR_PARAM - Incorrect paramater in props
SZ_ERROR_WRITE - ISeqOutStream write callback error
SZ_ERROR_OUTPUT_EOF - output buffer overflow - version with (Byte *) output
SZ_ERROR_PROGRESS - some break from progress callback
SZ_ERROR_THREAD - error in multithreading functions (only for Mt version)
*/
typedef struct CLzma2Enc CLzma2Enc;
typedef CLzma2Enc * CLzma2EncHandle;
// Z7_DECLARE_HANDLE(CLzma2EncHandle)
CLzma2EncHandle Lzma2Enc_Create(ISzAllocPtr alloc, ISzAllocPtr allocBig);
void Lzma2Enc_Destroy(CLzma2EncHandle p);
SRes Lzma2Enc_SetProps(CLzma2EncHandle p, const CLzma2EncProps *props);
void Lzma2Enc_SetDataSize(CLzma2EncHandle p, UInt64 expectedDataSiize);
Byte Lzma2Enc_WriteProperties(CLzma2EncHandle p);
SRes Lzma2Enc_Encode2(CLzma2EncHandle p,
ISeqOutStreamPtr outStream,
Byte *outBuf, size_t *outBufSize,
ISeqInStreamPtr inStream,
const Byte *inData, size_t inDataSize,
ICompressProgressPtr progress);
EXTERN_C_END
#endif

View File

@@ -1,12 +1,15 @@
/* Lzma86Enc.h -- LZMA + x86 (BCJ) Filter Encoder
2008-08-05
Igor Pavlov
Public domain */
/* Lzma86.h -- LZMA + x86 (BCJ) Filter
2023-03-03 : Igor Pavlov : Public domain */
#ifndef __LZMA86ENC_H
#define __LZMA86ENC_H
#ifndef ZIP7_INC_LZMA86_H
#define ZIP7_INC_LZMA86_H
#include "../Types.h"
#include "7zTypes.h"
EXTERN_C_BEGIN
#define LZMA86_SIZE_OFFSET (1 + 5)
#define LZMA86_HEADER_SIZE (LZMA86_SIZE_OFFSET + 8)
/*
It's an example for LZMA + x86 Filter use.
@@ -14,8 +17,8 @@ You can use .lzma86 extension, if you write that stream to file.
.lzma86 header adds one additional byte to standard .lzma header.
.lzma86 header (14 bytes):
Offset Size Description
0 1 = 0 - no filter,
= 1 - x86 filter
0 1 = 0 - no filter, pure LZMA
= 1 - x86 filter + LZMA
1 1 lc, lp and pb in encoded form
2 4 dictSize (little endian)
6 8 uncompressed size (little endian)
@@ -25,7 +28,6 @@ Lzma86_Encode
-------------
level - compression level: 0 <= level <= 9, the default value for "level" is 5.
dictSize - The dictionary size in bytes. The maximum value is
128 MB = (1 << 27) bytes for 32-bit version
1 GB = (1 << 30) bytes for 64-bit version
@@ -69,4 +71,41 @@ enum ESzFilterMode
SRes Lzma86_Encode(Byte *dest, size_t *destLen, const Byte *src, size_t srcLen,
int level, UInt32 dictSize, int filterMode);
/*
Lzma86_GetUnpackSize:
In:
src - input data
srcLen - input data size
Out:
unpackSize - size of uncompressed stream
Return code:
SZ_OK - OK
SZ_ERROR_INPUT_EOF - Error in headers
*/
SRes Lzma86_GetUnpackSize(const Byte *src, SizeT srcLen, UInt64 *unpackSize);
/*
Lzma86_Decode:
In:
dest - output data
destLen - output data size
src - input data
srcLen - input data size
Out:
destLen - processed output size
srcLen - processed input size
Return code:
SZ_OK - OK
SZ_ERROR_DATA - Data error
SZ_ERROR_MEM - Memory allocation error
SZ_ERROR_UNSUPPORTED - unsupported file
SZ_ERROR_INPUT_EOF - it needs more bytes in input buffer
*/
SRes Lzma86_Decode(Byte *dest, SizeT *destLen, const Byte *src, SizeT *srcLen);
EXTERN_C_END
#endif

View File

@@ -1,20 +1,13 @@
/* Lzma86Dec.c -- LZMA + x86 (BCJ) Filter Decoder
2008-04-07
Igor Pavlov
Public domain */
2023-03-03 : Igor Pavlov : Public domain */
#include "Lzma86Dec.h"
#include "Precomp.h"
#include "../Alloc.h"
#include "../Bra.h"
#include "../LzmaDec.h"
#include "Lzma86.h"
#define LZMA86_SIZE_OFFSET (1 + LZMA_PROPS_SIZE)
#define LZMA86_HEADER_SIZE (LZMA86_SIZE_OFFSET + 8)
static void *SzAlloc(void *p, size_t size) { p = p; return MyAlloc(size); }
static void SzFree(void *p, void *address) { p = p; MyFree(address); }
static ISzAlloc g_Alloc = { SzAlloc, SzFree };
#include "Alloc.h"
#include "Bra.h"
#include "LzmaDec.h"
SRes Lzma86_GetUnpackSize(const Byte *src, SizeT srcLen, UInt64 *unpackSize)
{
@@ -53,9 +46,8 @@ SRes Lzma86_Decode(Byte *dest, SizeT *destLen, const Byte *src, SizeT *srcLen)
return res;
if (useFilter == 1)
{
UInt32 x86State;
x86_Convert_Init(x86State);
x86_Convert(dest, *destLen, 0, &x86State, 0);
UInt32 x86State = Z7_BRANCH_CONV_ST_X86_STATE_INIT_VAL;
z7_BranchConvSt_X86_Dec(dest, *destLen, 0, &x86State);
}
return SZ_OK;
}

View File

@@ -1,31 +1,22 @@
/* Lzma86Enc.c -- LZMA + x86 (BCJ) Filter Encoder
2008-08-05
Igor Pavlov
Public domain */
2023-03-03 : Igor Pavlov : Public domain */
#include "Precomp.h"
#include <string.h>
#include "Lzma86Enc.h"
#include "Lzma86.h"
#include "../Alloc.h"
#include "../Bra.h"
#include "../LzmaEnc.h"
#define SZE_OUT_OVERFLOW SZE_DATA_ERROR
static void *SzAlloc(void *p, size_t size) { p = p; return MyAlloc(size); }
static void SzFree(void *p, void *address) { p = p; MyFree(address); }
static ISzAlloc g_Alloc = { SzAlloc, SzFree };
#define LZMA86_SIZE_OFFSET (1 + LZMA_PROPS_SIZE)
#define LZMA86_HEADER_SIZE (LZMA86_SIZE_OFFSET + 8)
#include "Alloc.h"
#include "Bra.h"
#include "LzmaEnc.h"
int Lzma86_Encode(Byte *dest, size_t *destLen, const Byte *src, size_t srcLen,
int level, UInt32 dictSize, int filterMode)
{
size_t outSize2 = *destLen;
Byte *filteredStream;
Bool useFilter;
BoolInt useFilter;
int mainResult = SZ_ERROR_OUTPUT_EOF;
CLzmaEncProps props;
LzmaEncProps_Init(&props);
@@ -55,15 +46,14 @@ int Lzma86_Encode(Byte *dest, size_t *destLen, const Byte *src, size_t srcLen,
memcpy(filteredStream, src, srcLen);
}
{
UInt32 x86State;
x86_Convert_Init(x86State);
x86_Convert(filteredStream, srcLen, 0, &x86State, 1);
UInt32 x86State = Z7_BRANCH_CONV_ST_X86_STATE_INIT_VAL;
z7_BranchConvSt_X86_Enc(filteredStream, srcLen, 0, &x86State);
}
}
{
size_t minSize = 0;
Bool bestIsFiltered = False;
BoolInt bestIsFiltered = False;
/* passes for SZ_FILTER_AUTO:
0 - BCJ + LZMA
@@ -78,7 +68,7 @@ int Lzma86_Encode(Byte *dest, size_t *destLen, const Byte *src, size_t srcLen,
size_t outSizeProcessed = outSize2 - LZMA86_HEADER_SIZE;
size_t outPropsSize = 5;
SRes curRes;
Bool curModeIsFiltered = (numPasses > 1 && i == numPasses - 1);
BoolInt curModeIsFiltered = (numPasses > 1 && i == numPasses - 1);
if (curModeIsFiltered && !bestIsFiltered)
break;
if (useFilter && i == 0)
@@ -104,7 +94,7 @@ int Lzma86_Encode(Byte *dest, size_t *destLen, const Byte *src, size_t srcLen,
}
}
}
dest[0] = (bestIsFiltered ? 1 : 0);
dest[0] = (Byte)(bestIsFiltered ? 1 : 0);
*destLen = LZMA86_HEADER_SIZE + minSize;
}
if (useFilter)

View File

File diff suppressed because it is too large Load Diff

View File

@@ -1,29 +1,36 @@
/* LzmaDec.h -- LZMA Decoder
2008-10-04 : Igor Pavlov : Public domain */
2023-04-02 : Igor Pavlov : Public domain */
#ifndef __LZMADEC_H
#define __LZMADEC_H
#ifndef ZIP7_INC_LZMA_DEC_H
#define ZIP7_INC_LZMA_DEC_H
#include "Types.h"
#include "7zTypes.h"
/* #define _LZMA_PROB32 */
/* _LZMA_PROB32 can increase the speed on some CPUs,
EXTERN_C_BEGIN
/* #define Z7_LZMA_PROB32 */
/* Z7_LZMA_PROB32 can increase the speed on some CPUs,
but memory usage for CLzmaDec::probs will be doubled in that case */
#ifdef _LZMA_PROB32
#define CLzmaProb UInt32
typedef
#ifdef Z7_LZMA_PROB32
UInt32
#else
#define CLzmaProb UInt16
UInt16
#endif
CLzmaProb;
/* ---------- LZMA Properties ---------- */
#define LZMA_PROPS_SIZE 5
typedef struct _CLzmaProps
typedef struct
{
unsigned lc, lp, pb;
Byte lc;
Byte lp;
Byte pb;
Byte _pad_;
UInt32 dicSize;
} CLzmaProps;
@@ -45,32 +52,35 @@ SRes LzmaProps_Decode(CLzmaProps *p, const Byte *data, unsigned size);
typedef struct
{
/* Don't change this structure. ASM code can use it. */
CLzmaProps prop;
CLzmaProb *probs;
CLzmaProb *probs_1664;
Byte *dic;
const Byte *buf;
UInt32 range, code;
SizeT dicPos;
SizeT dicBufSize;
SizeT dicPos;
const Byte *buf;
UInt32 range;
UInt32 code;
UInt32 processedPos;
UInt32 checkDicSize;
unsigned state;
UInt32 reps[4];
unsigned remainLen;
int needFlush;
int needInitState;
UInt32 state;
UInt32 remainLen;
UInt32 numProbs;
unsigned tempBufSize;
Byte tempBuf[LZMA_REQUIRED_INPUT_MAX];
} CLzmaDec;
#define LzmaDec_Construct(p) { (p)->dic = 0; (p)->probs = 0; }
#define LzmaDec_CONSTRUCT(p) { (p)->dic = NULL; (p)->probs = NULL; }
#define LzmaDec_Construct(p) LzmaDec_CONSTRUCT(p)
void LzmaDec_Init(CLzmaDec *p);
/* There are two types of LZMA streams:
0) Stream with end mark. That end mark adds about 6 bytes to compressed size.
1) Stream without end mark. You must know exact uncompressed size to decompress such stream. */
- Stream with end mark. That end mark adds about 6 bytes to compressed size.
- Stream without end mark. You must know exact uncompressed size to decompress such stream. */
typedef enum
{
@@ -127,11 +137,11 @@ LzmaDec_Allocate* can return:
SZ_ERROR_UNSUPPORTED - Unsupported properties
*/
SRes LzmaDec_AllocateProbs(CLzmaDec *p, const Byte *props, unsigned propsSize, ISzAlloc *alloc);
void LzmaDec_FreeProbs(CLzmaDec *p, ISzAlloc *alloc);
SRes LzmaDec_AllocateProbs(CLzmaDec *p, const Byte *props, unsigned propsSize, ISzAllocPtr alloc);
void LzmaDec_FreeProbs(CLzmaDec *p, ISzAllocPtr alloc);
SRes LzmaDec_Allocate(CLzmaDec *state, const Byte *prop, unsigned propsSize, ISzAlloc *alloc);
void LzmaDec_Free(CLzmaDec *state, ISzAlloc *alloc);
SRes LzmaDec_Allocate(CLzmaDec *p, const Byte *props, unsigned propsSize, ISzAllocPtr alloc);
void LzmaDec_Free(CLzmaDec *p, ISzAllocPtr alloc);
/* ---------- Dictionary Interface ---------- */
@@ -140,7 +150,7 @@ void LzmaDec_Free(CLzmaDec *state, ISzAlloc *alloc);
You must work with CLzmaDec variables directly in this interface.
STEPS:
LzmaDec_Constr()
LzmaDec_Construct()
LzmaDec_Allocate()
for (each new stream)
{
@@ -172,6 +182,7 @@ Returns:
LZMA_STATUS_NEEDS_MORE_INPUT
LZMA_STATUS_MAYBE_FINISHED_WITHOUT_MARK
SZ_ERROR_DATA - Data error
SZ_ERROR_FAIL - Some unexpected error: internal error of code, memory corruption or hardware failure
*/
SRes LzmaDec_DecodeToDic(CLzmaDec *p, SizeT dicLimit,
@@ -214,10 +225,13 @@ Returns:
SZ_ERROR_MEM - Memory allocation error
SZ_ERROR_UNSUPPORTED - Unsupported properties
SZ_ERROR_INPUT_EOF - It needs more bytes in input buffer (src).
SZ_ERROR_FAIL - Some unexpected error: internal error of code, memory corruption or hardware failure
*/
SRes LzmaDecode(Byte *dest, SizeT *destLen, const Byte *src, SizeT *srcLen,
const Byte *propData, unsigned propSize, ELzmaFinishMode finishMode,
ELzmaStatus *status, ISzAlloc *alloc);
ELzmaStatus *status, ISzAllocPtr alloc);
EXTERN_C_END
#endif

View File

File diff suppressed because it is too large Load Diff

View File

@@ -1,19 +1,21 @@
/* LzmaEnc.h -- LZMA Encoder
2008-10-04 : Igor Pavlov : Public domain */
2023-04-13 : Igor Pavlov : Public domain */
#ifndef __LZMAENC_H
#define __LZMAENC_H
#ifndef ZIP7_INC_LZMA_ENC_H
#define ZIP7_INC_LZMA_ENC_H
#include "Types.h"
#include "7zTypes.h"
EXTERN_C_BEGIN
#define LZMA_PROPS_SIZE 5
typedef struct _CLzmaEncProps
typedef struct
{
int level; /* 0 <= level <= 9 */
int level; /* 0 <= level <= 9 */
UInt32 dictSize; /* (1 << 12) <= dictSize <= (1 << 27) for 32-bit version
(1 << 12) <= dictSize <= (1 << 30) for 64-bit version
default = (1 << 24) */
(1 << 12) <= dictSize <= (3 << 29) for 64-bit version
default = (1 << 24) */
int lc; /* 0 <= lc <= 8, default = 3 */
int lp; /* 0 <= lp <= 4, default = 0 */
int pb; /* 0 <= pb <= 4, default = 2 */
@@ -21,9 +23,17 @@ typedef struct _CLzmaEncProps
int fb; /* 5 <= fb <= 273, default = 32 */
int btMode; /* 0 - hashChain Mode, 1 - binTree mode - normal, default = 1 */
int numHashBytes; /* 2, 3 or 4, default = 4 */
UInt32 mc; /* 1 <= mc <= (1 << 30), default = 32 */
unsigned numHashOutBits; /* default = ? */
UInt32 mc; /* 1 <= mc <= (1 << 30), default = 32 */
unsigned writeEndMark; /* 0 - do not write EOPM, 1 - write EOPM, default = 0 */
int numThreads; /* 1 or 2, default = 2 */
// int _pad;
UInt64 reduceSize; /* estimated size of data that will be compressed. default = (UInt64)(Int64)-1.
Encoder uses this value to reduce dictionary size */
UInt64 affinity;
} CLzmaEncProps;
void LzmaEncProps_Init(CLzmaEncProps *p);
@@ -33,40 +43,41 @@ UInt32 LzmaEncProps_GetDictSize(const CLzmaEncProps *props2);
/* ---------- CLzmaEncHandle Interface ---------- */
/* LzmaEnc_* functions can return the following exit codes:
Returns:
/* LzmaEnc* functions can return the following exit codes:
SRes:
SZ_OK - OK
SZ_ERROR_MEM - Memory allocation error
SZ_ERROR_PARAM - Incorrect paramater in props
SZ_ERROR_WRITE - Write callback error.
SZ_ERROR_WRITE - ISeqOutStream write callback error
SZ_ERROR_OUTPUT_EOF - output buffer overflow - version with (Byte *) output
SZ_ERROR_PROGRESS - some break from progress callback
SZ_ERROR_THREAD - errors in multithreading functions (only for Mt version)
SZ_ERROR_THREAD - error in multithreading functions (only for Mt version)
*/
typedef void * CLzmaEncHandle;
typedef struct CLzmaEnc CLzmaEnc;
typedef CLzmaEnc * CLzmaEncHandle;
// Z7_DECLARE_HANDLE(CLzmaEncHandle)
CLzmaEncHandle LzmaEnc_Create(ISzAllocPtr alloc);
void LzmaEnc_Destroy(CLzmaEncHandle p, ISzAllocPtr alloc, ISzAllocPtr allocBig);
CLzmaEncHandle LzmaEnc_Create(ISzAlloc *alloc);
void LzmaEnc_Destroy(CLzmaEncHandle p, ISzAlloc *alloc, ISzAlloc *allocBig);
SRes LzmaEnc_SetProps(CLzmaEncHandle p, const CLzmaEncProps *props);
void LzmaEnc_SetDataSize(CLzmaEncHandle p, UInt64 expectedDataSiize);
SRes LzmaEnc_WriteProperties(CLzmaEncHandle p, Byte *properties, SizeT *size);
SRes LzmaEnc_Encode(CLzmaEncHandle p, ISeqOutStream *outStream, ISeqInStream *inStream,
ICompressProgress *progress, ISzAlloc *alloc, ISzAlloc *allocBig);
unsigned LzmaEnc_IsWriteEndMark(CLzmaEncHandle p);
SRes LzmaEnc_Encode(CLzmaEncHandle p, ISeqOutStreamPtr outStream, ISeqInStreamPtr inStream,
ICompressProgressPtr progress, ISzAllocPtr alloc, ISzAllocPtr allocBig);
SRes LzmaEnc_MemEncode(CLzmaEncHandle p, Byte *dest, SizeT *destLen, const Byte *src, SizeT srcLen,
int writeEndMark, ICompressProgress *progress, ISzAlloc *alloc, ISzAlloc *allocBig);
int writeEndMark, ICompressProgressPtr progress, ISzAllocPtr alloc, ISzAllocPtr allocBig);
/* ---------- One Call Interface ---------- */
/* LzmaEncode
Return code:
SZ_OK - OK
SZ_ERROR_MEM - Memory allocation error
SZ_ERROR_PARAM - Incorrect paramater
SZ_ERROR_OUTPUT_EOF - output buffer overflow
SZ_ERROR_THREAD - errors in multithreading functions (only for Mt version)
*/
SRes LzmaEncode(Byte *dest, SizeT *destLen, const Byte *src, SizeT srcLen,
const CLzmaEncProps *props, Byte *propsEncoded, SizeT *propsSize, int writeEndMark,
ICompressProgress *progress, ISzAlloc *alloc, ISzAlloc *allocBig);
ICompressProgressPtr progress, ISzAllocPtr alloc, ISzAllocPtr allocBig);
EXTERN_C_END
#endif

View File

@@ -1,18 +1,14 @@
/* LzmaLib.c -- LZMA library wrapper
2008-08-05
Igor Pavlov
Public domain */
2023-04-02 : Igor Pavlov : Public domain */
#include "Precomp.h"
#include "LzmaEnc.h"
#include "LzmaDec.h"
#include "Alloc.h"
#include "LzmaDec.h"
#include "LzmaEnc.h"
#include "LzmaLib.h"
static void *SzAlloc(void *p, size_t size) { p = p; return MyAlloc(size); }
static void SzFree(void *p, void *address) { p = p; MyFree(address); }
static ISzAlloc g_Alloc = { SzAlloc, SzFree };
MY_STDAPI LzmaCompress(unsigned char *dest, size_t *destLen, const unsigned char *src, size_t srcLen,
Z7_STDAPI LzmaCompress(unsigned char *dest, size_t *destLen, const unsigned char *src, size_t srcLen,
unsigned char *outProps, size_t *outPropsSize,
int level, /* 0 <= level <= 9, default = 5 */
unsigned dictSize, /* use (1 << N) or (3 << N). 4 KB < dictSize <= 128 MB */
@@ -38,7 +34,7 @@ MY_STDAPI LzmaCompress(unsigned char *dest, size_t *destLen, const unsigned cha
}
MY_STDAPI LzmaUncompress(unsigned char *dest, size_t *destLen, const unsigned char *src, size_t *srcLen,
Z7_STDAPI LzmaUncompress(unsigned char *dest, size_t *destLen, const unsigned char *src, size_t *srcLen,
const unsigned char *props, size_t propsSize)
{
ELzmaStatus status;

View File

@@ -1,20 +1,14 @@
/* LzmaLib.h -- LZMA library interface
2008-08-05
Igor Pavlov
Public domain */
2023-04-02 : Igor Pavlov : Public domain */
#ifndef __LZMALIB_H
#define __LZMALIB_H
#ifndef ZIP7_INC_LZMA_LIB_H
#define ZIP7_INC_LZMA_LIB_H
#include "Types.h"
#include "7zTypes.h"
#ifdef __cplusplus
#define MY_EXTERN_C extern "C"
#else
#define MY_EXTERN_C extern
#endif
EXTERN_C_BEGIN
#define MY_STDAPI MY_EXTERN_C int MY_STD_CALL
#define Z7_STDAPI int Z7_STDCALL
#define LZMA_PROPS_SIZE 5
@@ -46,14 +40,16 @@ outPropsSize -
level - compression level: 0 <= level <= 9;
level dictSize algo fb
0: 16 KB 0 32
1: 64 KB 0 32
2: 256 KB 0 32
3: 1 MB 0 32
4: 4 MB 0 32
0: 64 KB 0 32
1: 256 KB 0 32
2: 1 MB 0 32
3: 4 MB 0 32
4: 16 MB 0 32
5: 16 MB 1 32
6: 32 MB 1 32
7+: 64 MB 1 64
7: 32 MB 1 64
8: 64 MB 1 64
9: 64 MB 1 64
The default value for "level" is 5.
@@ -89,6 +85,11 @@ fb - Word size (the number of fast bytes).
numThreads - The number of thereads. 1 or 2. The default value is 2.
Fast mode (algo = 0) can use only 1 thread.
In:
dest - output data buffer
destLen - output data buffer size
src - input data
srcLen - input data size
Out:
destLen - processed output size
Returns:
@@ -99,7 +100,7 @@ Returns:
SZ_ERROR_THREAD - errors in multithreading functions (only for Mt version)
*/
MY_STDAPI LzmaCompress(unsigned char *dest, size_t *destLen, const unsigned char *src, size_t srcLen,
Z7_STDAPI LzmaCompress(unsigned char *dest, size_t *destLen, const unsigned char *src, size_t srcLen,
unsigned char *outProps, size_t *outPropsSize, /* *outPropsSize must be = 5 */
int level, /* 0 <= level <= 9, default = 5 */
unsigned dictSize, /* default = (1 << 24) */
@@ -114,8 +115,8 @@ MY_STDAPI LzmaCompress(unsigned char *dest, size_t *destLen, const unsigned char
LzmaUncompress
--------------
In:
dest - output data
destLen - output data size
dest - output data buffer
destLen - output data buffer size
src - input data
srcLen - input data size
Out:
@@ -129,7 +130,9 @@ Returns:
SZ_ERROR_INPUT_EOF - it needs more bytes in input buffer (src)
*/
MY_STDAPI LzmaUncompress(unsigned char *dest, size_t *destLen, const unsigned char *src, SizeT *srcLen,
Z7_STDAPI LzmaUncompress(unsigned char *dest, size_t *destLen, const unsigned char *src, SizeT *srcLen,
const unsigned char *props, size_t propsSize);
EXTERN_C_END
#endif

View File

@@ -1,12 +0,0 @@
/* LzmaLibExports.c -- LZMA library DLL Entry point
2008-10-04 : Igor Pavlov : Public domain */
#include <windows.h>
BOOL WINAPI DllMain(HINSTANCE hInstance, DWORD dwReason, LPVOID lpReserved)
{
hInstance = hInstance;
dwReason = dwReason;
lpReserved = lpReserved;
return TRUE;
}

View File

@@ -1,45 +0,0 @@
/* Lzma86Dec.h -- LZMA + x86 (BCJ) Filter Decoder
2008-08-05
Igor Pavlov
Public domain */
#ifndef __LZMA86DEC_H
#define __LZMA86DEC_H
#include "../Types.h"
/*
Lzma86_GetUnpackSize:
In:
src - input data
srcLen - input data size
Out:
unpackSize - size of uncompressed stream
Return code:
SZ_OK - OK
SZ_ERROR_INPUT_EOF - Error in headers
*/
SRes Lzma86_GetUnpackSize(const Byte *src, SizeT srcLen, UInt64 *unpackSize);
/*
Lzma86_Decode:
In:
dest - output data
destLen - output data size
src - input data
srcLen - input data size
Out:
destLen - processed output size
srcLen - processed input size
Return code:
SZ_OK - OK
SZ_ERROR_DATA - Data error
SZ_ERROR_MEM - Memory allocation error
SZ_ERROR_UNSUPPORTED - unsupported file
SZ_ERROR_INPUT_EOF - it needs more bytes in input buffer
*/
SRes Lzma86_Decode(Byte *dest, SizeT *destLen, const Byte *src, SizeT *srcLen);
#endif

View File

@@ -1,254 +0,0 @@
/* LzmaUtil.c -- Test application for LZMA compression
2008-11-23 : Igor Pavlov : Public domain */
#define _CRT_SECURE_NO_WARNINGS
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "../Alloc.h"
#include "../7zFile.h"
#include "../7zVersion.h"
#include "../LzmaDec.h"
#include "../LzmaEnc.h"
const char *kCantReadMessage = "Can not read input file";
const char *kCantWriteMessage = "Can not write output file";
const char *kCantAllocateMessage = "Can not allocate memory";
const char *kDataErrorMessage = "Data error";
static void *SzAlloc(void *p, size_t size) { p = p; return MyAlloc(size); }
static void SzFree(void *p, void *address) { p = p; MyFree(address); }
static ISzAlloc g_Alloc = { SzAlloc, SzFree };
void PrintHelp(char *buffer)
{
strcat(buffer, "\nLZMA Utility " MY_VERSION_COPYRIGHT_DATE "\n"
"\nUsage: lzma <e|d> inputFile outputFile\n"
" e: encode file\n"
" d: decode file\n");
}
int PrintError(char *buffer, const char *message)
{
strcat(buffer, "\nError: ");
strcat(buffer, message);
strcat(buffer, "\n");
return 1;
}
int PrintErrorNumber(char *buffer, SRes val)
{
sprintf(buffer + strlen(buffer), "\nError code: %x\n", (unsigned)val);
return 1;
}
int PrintUserError(char *buffer)
{
return PrintError(buffer, "Incorrect command");
}
#define IN_BUF_SIZE (1 << 16)
#define OUT_BUF_SIZE (1 << 16)
static SRes Decode2(CLzmaDec *state, ISeqOutStream *outStream, ISeqInStream *inStream,
UInt64 unpackSize)
{
int thereIsSize = (unpackSize != (UInt64)(Int64)-1);
Byte inBuf[IN_BUF_SIZE];
Byte outBuf[OUT_BUF_SIZE];
size_t inPos = 0, inSize = 0, outPos = 0;
LzmaDec_Init(state);
for (;;)
{
if (inPos == inSize)
{
inSize = IN_BUF_SIZE;
RINOK(inStream->Read(inStream, inBuf, &inSize));
inPos = 0;
}
{
SRes res;
SizeT inProcessed = inSize - inPos;
SizeT outProcessed = OUT_BUF_SIZE - outPos;
ELzmaFinishMode finishMode = LZMA_FINISH_ANY;
ELzmaStatus status;
if (thereIsSize && outProcessed > unpackSize)
{
outProcessed = (SizeT)unpackSize;
finishMode = LZMA_FINISH_END;
}
res = LzmaDec_DecodeToBuf(state, outBuf + outPos, &outProcessed,
inBuf + inPos, &inProcessed, finishMode, &status);
inPos += inProcessed;
outPos += outProcessed;
unpackSize -= outProcessed;
if (outStream)
if (outStream->Write(outStream, outBuf, outPos) != outPos)
return SZ_ERROR_WRITE;
outPos = 0;
if (res != SZ_OK || thereIsSize && unpackSize == 0)
return res;
if (inProcessed == 0 && outProcessed == 0)
{
if (thereIsSize || status != LZMA_STATUS_FINISHED_WITH_MARK)
return SZ_ERROR_DATA;
return res;
}
}
}
}
static SRes Decode(ISeqOutStream *outStream, ISeqInStream *inStream)
{
UInt64 unpackSize;
int i;
SRes res = 0;
CLzmaDec state;
/* header: 5 bytes of LZMA properties and 8 bytes of uncompressed size */
unsigned char header[LZMA_PROPS_SIZE + 8];
/* Read and parse header */
RINOK(SeqInStream_Read(inStream, header, sizeof(header)));
unpackSize = 0;
for (i = 0; i < 8; i++)
unpackSize += (UInt64)header[LZMA_PROPS_SIZE + i] << (i * 8);
LzmaDec_Construct(&state);
RINOK(LzmaDec_Allocate(&state, header, LZMA_PROPS_SIZE, &g_Alloc));
res = Decode2(&state, outStream, inStream, unpackSize);
LzmaDec_Free(&state, &g_Alloc);
return res;
}
static SRes Encode(ISeqOutStream *outStream, ISeqInStream *inStream, UInt64 fileSize, char *rs)
{
CLzmaEncHandle enc;
SRes res;
CLzmaEncProps props;
rs = rs;
enc = LzmaEnc_Create(&g_Alloc);
if (enc == 0)
return SZ_ERROR_MEM;
LzmaEncProps_Init(&props);
res = LzmaEnc_SetProps(enc, &props);
if (res == SZ_OK)
{
Byte header[LZMA_PROPS_SIZE + 8];
size_t headerSize = LZMA_PROPS_SIZE;
int i;
res = LzmaEnc_WriteProperties(enc, header, &headerSize);
for (i = 0; i < 8; i++)
header[headerSize++] = (Byte)(fileSize >> (8 * i));
if (outStream->Write(outStream, header, headerSize) != headerSize)
res = SZ_ERROR_WRITE;
else
{
if (res == SZ_OK)
res = LzmaEnc_Encode(enc, outStream, inStream, NULL, &g_Alloc, &g_Alloc);
}
}
LzmaEnc_Destroy(enc, &g_Alloc, &g_Alloc);
return res;
}
int main2(int numArgs, const char *args[], char *rs)
{
CFileSeqInStream inStream;
CFileOutStream outStream;
char c;
int res;
int encodeMode;
Bool useOutFile = False;
FileSeqInStream_CreateVTable(&inStream);
File_Construct(&inStream.file);
FileOutStream_CreateVTable(&outStream);
File_Construct(&outStream.file);
if (numArgs == 1)
{
PrintHelp(rs);
return 0;
}
if (numArgs < 3 || numArgs > 4 || strlen(args[1]) != 1)
return PrintUserError(rs);
c = args[1][0];
encodeMode = (c == 'e' || c == 'E');
if (!encodeMode && c != 'd' && c != 'D')
return PrintUserError(rs);
{
size_t t4 = sizeof(UInt32);
size_t t8 = sizeof(UInt64);
if (t4 != 4 || t8 != 8)
return PrintError(rs, "Incorrect UInt32 or UInt64");
}
if (InFile_Open(&inStream.file, args[2]) != 0)
return PrintError(rs, "Can not open input file");
if (numArgs > 3)
{
useOutFile = True;
if (OutFile_Open(&outStream.file, args[3]) != 0)
return PrintError(rs, "Can not open output file");
}
else if (encodeMode)
PrintUserError(rs);
if (encodeMode)
{
UInt64 fileSize;
File_GetLength(&inStream.file, &fileSize);
res = Encode(&outStream.s, &inStream.s, fileSize, rs);
}
else
{
res = Decode(&outStream.s, useOutFile ? &inStream.s : NULL);
}
if (useOutFile)
File_Close(&outStream.file);
File_Close(&inStream.file);
if (res != SZ_OK)
{
if (res == SZ_ERROR_MEM)
return PrintError(rs, kCantAllocateMessage);
else if (res == SZ_ERROR_DATA)
return PrintError(rs, kDataErrorMessage);
else if (res == SZ_ERROR_WRITE)
return PrintError(rs, kCantWriteMessage);
else if (res == SZ_ERROR_READ)
return PrintError(rs, kCantReadMessage);
return PrintErrorNumber(rs, res);
}
return 0;
}
int MY_CDECL main(int numArgs, const char *args[])
{
char rs[800] = { 0 };
int res = main2(numArgs, args, rs);
printf(rs);
return res;
}

View File

@@ -1,44 +0,0 @@
PROG = lzma
CXX = g++
LIB =
RM = rm -f
CFLAGS = -c -O2 -Wall
OBJS = \
LzmaUtil.o \
Alloc.o \
LzFind.o \
LzmaDec.o \
LzmaEnc.o \
7zFile.o \
7zStream.o \
all: $(PROG)
$(PROG): $(OBJS)
$(CXX) -o $(PROG) $(LDFLAGS) $(OBJS) $(LIB) $(LIB2)
LzmaUtil.o: LzmaUtil.c
$(CXX) $(CFLAGS) LzmaUtil.c
Alloc.o: ../Alloc.c
$(CXX) $(CFLAGS) ../Alloc.c
LzFind.o: ../LzFind.c
$(CXX) $(CFLAGS) ../LzFind.c
LzmaDec.o: ../LzmaDec.c
$(CXX) $(CFLAGS) ../LzmaDec.c
LzmaEnc.o: ../LzmaEnc.c
$(CXX) $(CFLAGS) ../LzmaEnc.c
7zFile.o: ../7zFile.c
$(CXX) $(CFLAGS) ../7zFile.c
7zStream.o: ../7zStream.c
$(CXX) $(CFLAGS) ../7zStream.c
clean:
-$(RM) $(PROG) $(OBJS)

Some files were not shown because too many files have changed in this diff Show More