1
0
mirror of https://github.com/upx/upx synced 2025-09-28 19:06:07 +08:00
upx/doc/upx.pod
Markus F.X.J. Oberhumer fa772703d4 Prepared for 1.10 release.
committer: mfx <mfx> 977233750 +0000
2000-12-19 13:49:10 +00:00

806 lines
26 KiB
Plaintext

=head1 NAME
upx - compress or expand executable files
=head1 SYNOPSIS
B<upx> S<[ I<command> ]> S<[ I<options> ]> I<filename>...
=head1 ABSTRACT
The Ultimate Packer for eXecutables
Copyright (c) 1996, 1997, 1998, 1999, 2000
Markus F.X.J. Oberhumer, Laszlo Molnar & John F. Reiser
http://wildsau.idv.uni-linz.ac.at/mfx/upx.html
http://upx.tsx.org
B<UPX> is a portable, extendable, high-performance executable packer for
several different executable formats. It achieves an excellent compression
ratio and offers I<*very*> fast decompression. Your executables suffer
no memory overhead or other drawbacks for most of the formats supported.
While you may use UPX freely for both non-commercial and commercial
executables (for details see the file LICENSE), we would highly
appreciate if you credit UPX and ourselves in the documentation,
possibly including a reference to the UPX home page. Thanks.
[ Using UPX in non-OpenSource applications without proper credits
is considered not politically correct ;-) ]
=head1 DISCLAIMER
UPX comes with ABSOLUTELY NO WARRANTY; for details see the file LICENSE.
Having said that, we think that UPX is quite stable now. Indeed we
have compressed lots of files without any problems. Also, the
current version has undergone several months of beta testing -
actually it's more than 2 1/2 years since our first public beta.
This is the first production quality release, and we plan that future 1.xx
releases will be backward compatible with this version.
Please report all problems or suggestions to the authors. Thanks.
=head1 DESCRIPTION
B<UPX> is a versatile executable packer with the following features:
- excellent compression ratio: compresses better than zip/gzip,
use UPX to decrease the size of your distribution !
- very fast decompression: about 10 MB/sec even on my old Pentium 133
- no memory overhead for your compressed executables for most of the
supported formats
- safe: you can list, test and unpack your executables
Also, a checksum of both the compressed and uncompressed file is
maintained internally.
- universal: UPX can pack a number of executable formats:
* atari/tos
* bvmlinuz/386 [bootable Linux kernel]
* djgpp2/coff
* dos/com
* dos/exe
* dos/sys
* linux/386
* linux/elf386
* linux/sh386
* rtm32/pe
* tmt/adam
* vmlinuz/386 [bootable Linux kernel]
* watcom/le (supporting DOS4G, PMODE/W, DOS32a and CauseWay)
* win32/pe
- portable: UPX is written in portable endian-neutral C++
- extendable: because of the class layout it's very easy to support
new executable formats or add new compression algorithms
- free: UPX can be distributed and used freely. And from version 0.99
the full source code of UPX is released under the GNU General Public
License (GPL) !
You probably understand now why we call UPX the "I<ultimate>"
executable packer.
=head1 COMMANDS
=head2 Compress
This is the default operation, eg. B<upx yourfile.exe> will compress the file
specified on the command line.
=head2 Decompress
All UPX supported file formats can be unpacked using the B<-d> switch, eg.
B<upx -d yourfile.exe> will uncompress the file you've just compressed.
=head2 Test
The B<-t> command tests the integrity of the compressed and uncompressed
data, eg. B<upx -t yourfile.exe> check whether your file can be safely
decompressed. Note, that this command doesn't check the whole file, only
the part that will be uncompressed during program execution. This means
that you should not use this command instead of a virus checker.
=head2 List
The B<-l> command prints out some information about the compressed files
specified on the command line as parameters, eg B<upx -l yourfile.exe>
shows the compressed / uncompressed size and the compression ratio of
I<yourfile.exe>.
=head1 OPTIONS
B<-q>: be quiet, suppress warnings
B<-q -q> (or B<-qq>): be very quiet, suppress errors
B<-q -q -q> (or B<-qqq>): produce no output at all
B<--help>: prints the help
B<--version>: print the version of UPX
B<--stdout>: writes all output to stdout
[ ...to be written... - type `B<upx --help>' for now ]
=head1 COMPRESSION LEVELS & TUNING
B<UPX> offers ten different compression levels from B<-1> to B<-9>,
and B<--best>. The default compression level is B<-8> for files
smaller than 512 kB, and B<-7> otherwise.
=over 4
=item *
Compression levels 1, 2 and 3 are pretty fast.
=item *
Compression levels 4, 5 and 6 achieve a good time/ratio performance.
=item *
Compression levels 7, 8 and 9 favor compression ratio over speed.
=item *
Compression level B<--best> may take a long time.
=back
Note that compression level B<-9> can be somewhat slow for large
files, but you definitely should use it when releasing a final version
of your program.
Since UPX 0.70 there is also an extra compression level B<--best> which
squeezes out even some more compression ratio. While it is usually fine
to use this option with your favorite .com file it may take a long time
to compress a multi-megabyte program. You have been warned.
Tips for even better compression:
=over 4
=item *
Try if B<--overlay=strip> works.
=item *
For win32/pe programs there's B<--strip-relocs=0>. See notes below.
=back
=head1 OVERLAY HANDLING OPTIONS
B<UPX> handles overlays like many other executable packers do: it simply
copies the overlay after the compressed image. This works with some
files, but doesn't work with others.
Since version 0.90 UPX defaults to B<--overlay=copy> for
all executable formats.
--overlay=copy Copy any extra data attached to the file. [DEFAULT]
--overlay=strip Strip any overlay from the program instead of
copying it. Be warned, this may make the compressed
program crash or otherwise unusable.
--overlay=skip Refuse to compress any program which has an overlay.
=head1 ENVIRONMENT
The environment variable B<UPX> can hold a set of default
options for UPX. These options are interpreted first and
can be overwritten by explicit command line parameters.
For example:
for DOS/Windows: set UPX=-9 --compress-icons#1
for sh/ksh/zsh: UPX="-9 --compress-icons=1"; export UPX
for csh/tcsh: setenv UPX "-9 --compress-icons=1"
Under DOS/Windows you must use '#' instead of '=' when setting the
environment variable because of a COMMAND.COM limitation.
On Vax/VMS, the name of the environment variable is
UPX_OPT, to avoid a conflict with the symbol set for
invocation of the program.
Not all of the options are valid in the environment variable -
UPX will tell you.
You can use the B<--no-env> option to turn this support off.
=head1 NOTES FOR THE SUPPORTED EXECUTABLE FORMATS
=head2 NOTES FOR ATARI/TOS
This is the executable format used by the Atari ST/TT, a 68000 based
personal computer which was popular in the late '80s. Support
of this format is only because of nostalgic feelings of one of
the authors and serves no practical purpose :-).
Packed programs will be byte-identical to the original after uncompression.
All debug information will be stripped, though.
Extra options available for this executable format:
(none)
=head2 NOTES FOR BVMLINUZ/I386
Same as vmlinuz/i386.
=head2 NOTES FOR DOS/COM
Obviously UPX won't work with executables that want to read data from
themselves (like some commandline utilities that ship with Win95/98/ME).
Compressed programs only work on a 286+.
Packed programs will be byte-identical to the original after uncompression.
Maximum uncompressed size: ~65100 bytes.
Extra options available for this executable format:
--8086 Create an executable that works on any 8086 CPU.
=head2 NOTES FOR DOS/EXE
dos/exe stands for all "normal" 16-bit DOS executables.
Obviously UPX won't work with executables that want to read data from
themselves (like some command line utilities that ship with Win95/98/ME).
Compressed programs only work on a 286+.
Extra options available for this executable format:
--8086 Create an executable that works on any 8086 CPU.
--no-reloc Use no relocation records in the exe header.
=head2 NOTES FOR DOS/SYS
You can only compress plain sys files, sys/exe (two in one)
combos are not supported.
Compressed programs only work on a 286+.
Packed programs will be byte-identical to the original after uncompression.
Maximum uncompressed size: ~65350 bytes.
Extra options available for this executable format:
--8086 Create an executable that works on any 8086 CPU.
=head2 NOTES FOR DJGPP2/COFF
First of all, it is recommended to use UPX *instead* of B<strip>. strip has
the very bad habit of replacing your stub with its own (outdated) version.
Additionally UPX corrects a bug/feature in strip v2.8.x: it
will fix the 4 KByte aligment of the stub.
UPX includes the full functionality of stubify. This means it will
automatically stubify your COFF files. Use the option B<--coff> to
disable this behaviour (see below).
UPX automatically handles Allegro packfiles.
The DLM format (a rather exotic shared library extension) is not supported.
Packed programs will be byte-identical to the original after uncompression.
All debug information and trailing garbage will be stripped, though.
BTW, UPX is the successor of the DJP executable packer.
Extra options available for this executable format:
--coff Produce COFF output instead of EXE. By default
UPX keeps your current stub.
=head2 NOTES FOR LINUX [general]
Introduction
Linux/386 support in UPX consists of 3 different executable formats,
one optimized for ELF excutables ("linux/elf386"), one optimized
for shell scripts ("linux/sh386"), and one generic format
("linux/386").
We will start with a general discussion first, but please
also read the relevant docs for each of the formats.
Also, there is special support for bootable kernels - see the
description of the vmlinuz/386 format.
General user's overview
Running a compressed executable program trades space on a ``permanent''
storage medium (such as a hard disk, floppy disk, CD-ROM, flash
memory, EPROM, etc.) for space in one or more ``temporary'' storage
media (such as RAM, swap space, /tmp, etc.). Running a compressed
executable also requires some additional CPU cycles to generate
the compressed executable in the first place, and to decompress
it at each invocation.
How much space is traded? It depends on the executable, but many
programs save 30% to 50% of permanent disk space. How much CPU
overhead is there? Again, it depends on the executable, but
decompression speed generally is at least many megabytes per second,
and frequently is limited by the speed of the underlying disk
or network I/O.
Depending on the statistics of usage and access, and the relative
speeds of CPU, RAM, swap space, /tmp, and filesystem storage, then
invoking and running a compressed executable can be faster than
directly running the corresponding uncompressed program.
The operating system might perfrom fewer expensive I/O operations
to invoke the compressed program. Paging to or from swap space
or /tmp might be faster than paging from the general filesystem.
``Medium-sized'' programs which access about 1/3 to 1/2 of their
stored program bytes can do particulary well with compression.
Small programs tend not to benefit as much because the absolute
savings is less. Big programs tend not to benefit proportionally
because each invocation may use only a small fraction of the program,
yet UPX decompresses the entire program before invoking it.
But in environments where disk or flash memory storage is limited,
then compression may win anyway.
Currently, executables compressed by UPX do not share RAM at runtime
in the way that executables mapped from a filesystem do. As a
result, if the same program is run simultaneously by more than one
process, then using the compressed version will require more RAM and/or
swap space. So, shell programs (bash, csh, etc.) and ``make''
might not be good candidates for compression.
UPX recognizes three executable formats for Linux: Linux/elf386,
Linux/sh386, and Linux/386. Linux/386 is the most generic format;
it accommodates any file that can be executed. At runtime, the UPX
decompression stub re-creates in /tmp a copy of the original file,
and then the copy is (re-)executed with the same arguments.
ELF binary executables prefer the Linux/elf386 format by default,
because UPX decompresses them directly into RAM, uses only one
exec, does not use space in /tmp, and does not use /proc.
Shell scripts where the underling shell accepts a ``-c'' argument
can use the Linux/sh386 format. UPX decompresses the shell script
into low memory, then maps the shell and passes the entire text of the
script as an argument with a leading ``-c''.
General benefits:
- UPX can compress all executables, be it AOUT, ELF, libc4, libc5,
libc6, Shell/Perl/Python/... scripts, standalone Java .class
binaries, or whatever...
All scripts and programs will work just as before.
- Compressed programs are completely self-contained. No need for
any external program.
- UPX keeps your original program untouched. This means that
after decompression you will have a byte-identical version,
and you can use UPX as a file compressor just like gzip.
[ Note that UPX maintains a checksum of the file internally,
so it is indeed a reliable alternative. ]
- As the stub only uses syscalls and isn't linked against libc it
should run under any Linux configuration that can run ELF
binaries.
- For the same reason compressed executables should run under
FreeBSD and other systems which can run Linux binaries.
[ Please send feedback on this topic ]
General drawbacks:
- It is not advisable to compress programs which usually have many
instances running (like `sh' or `make') because the common segments of
compressed programs won't be shared any longer between different
processes.
- `ldd' and `size' won't show anything useful because all they
see is the statically linked stub. Since version 0.82 the section
headers are stripped from the UPX stub and `size' doesn't even
recognize the file format. The file patches/patch-elfcode.h has a
patch to fix this bug in `size' and other programs which use GNU BFD.
General notes:
- As UPX leaves your original program untouched it is advantageous
to strip it before compression.
- If you compress a script you will lose platform independence -
this could be a problem if you are using NFS mounted disks.
- Compression of suid, guid and sticky-bit programs is rejected
because of possible security implications.
- For the same reason there is no sense in making any compressed
program suid.
- Obviously UPX won't work with executables that want to read data
from themselves. E.g., this might be a problem for Perl scripts
which access their __DATA__ lines.
- In case of internal errors the stub will abort with exitcode 127.
Typical reasons for this to happen are that the program has somehow
been modified after compression.
Running `strace -o strace.log compressed_file' will tell you more.
=head2 NOTES FOR LINUX/ELF386
Please read the general Linux description first.
The linux/elf386 format decompresses directly into RAM,
uses only one exec, does not use space in /tmp,
and does not use /proc.
Linux/elf386 is automatically selected for Linux ELF exectuables.
Packed programs will be byte-identical to the original after uncompression.
How it works:
For ELF executables, UPX decompresses directly to memory, simulating
the mapping that the operating system kernel uses during exec(),
including the PT_INTERP program interpreter (if any).
The brk() is set by a special PT_LOAD segment in the compressed
executable itself. UPX then wipes the stack clean except for
arguments, environment variables, and Elf_auxv entries (this is
required by bugs in the startup code of /lib/ld-linux.so as of
May 2000), and transfers control to the program interpreter or
the e_entry address of the original executable.
The UPX stub is about 1700 bytes long, partly written in assembler
and only uses kernel syscalls. It is not linked against any libc.
Specific drawbacks:
- For linux/elf386 and linux/sh386 formats, you will be relying on
RAM and swap space to hold all of the decompressed program during
the lifetime of the process. If you already use most of your swap
space, then you may run out. A system that is "out of memory"
can become fragile. Many programs do not react gracefully when
malloc() returns 0. With newer Linux kernels, the kernel
may decide to kill some processes to regain memory, and you
may not like the kernel's choice of which to kill. Running
/usr/bin/top is one way to check on the usage of swap space.
Extra options available for this executable format:
(none)
=head2 NOTES FOR LINUX/SH386
Please read the general Linux description first.
Shell scripts where the underling shell accepts a ``-c'' argument
can use the Linux/sh386 format. UPX decompresses the shell script
into low memory, then maps the shell and passes the entire text of the
script as an argument with a leading ``-c''.
It does not use space in /tmp, and does not use /proc.
Linux/sh386 is automatically selected for shell scripts that
use a known shell.
Packed programs will be byte-identical to the original after uncompression.
How it works:
For shell script executables (files beginning with "#!/" or "#! /")
where the shell is known to accept "-c <command>", UPX decompresses
the file into low memory, then maps the shell (and its PT_INTERP),
and passes control to the shell with the entire decompressed file
as the argument after "-c". Known shells are sh, ash, bsh, csh,
ksh, tcsh, pdksh. Restriction: UPX 1.10 cannot use this method
for shell scripts which use the one optional string argument after
the shell name in the script (example: "#! /bin/sh option3\n".)
The UPX stub is about 1700 bytes long, partly written in assembler
and only uses kernel syscalls. It is not linked against any libc.
Specific drawbacks:
- For linux/elf386 and linux/sh386 formats, you will be relying on
RAM and swap space to hold all of the decompressed program during
the lifetime of the process. If you already use most of your swap
space, then you may run out. A system that is "out of memory"
can become fragile. Many programs do not react gracefully when
malloc() returns 0. With newer Linux kernels, the kernel
may decide to kill some processes to regain memory, and you
may not like the kernel's choice of which to kill. Running
/usr/bin/top is one way to check on the usage of swap space.
Extra options available for this executable format:
(none)
=head2 NOTES FOR LINUX/386
Please read the general Linux description first.
The generic linux/386 format deompresses to /tmp
and needs /proc filesystem support.
Linux/386 is only selected if the specialized linux/elf386
and linux/sh386 won't recognize a file.
Packed programs will be byte-identical to the original after uncompression.
How it works:
For files which are not ELF and not a script for a known "-c" shell,
UPX uses kernel exec(), which first requires decompressing to a
temporary file in the filesystem. Interestingly -
because of the good memory management of the Linux kernel - this
often does not introduce a noticable delay, and in fact there
will be no disk access at all if you have enough free memory as
the entire process takes places within the filesystem buffers.
A compressed executable consists of the UPX stub and an overlay
which contains the original program in a compressed form.
The UPX stub is a statically linked ELF executable and does
the following at program startup:
1) decompress the overlay to a temporary location in /tmp
2) open the temporary file for reading
3) try to delete the temporary file and start (execve)
the uncompressed program in /tmp using /proc/<pid>/fd/X as
attained by step 2)
4) if that fails, fork off a subprocess to clean up and
start the program in /tmp in the meantime
The UPX stub is about 1700 bytes long, partly written in assembler
and only uses kernel syscalls. It is not linked against any libc.
Specific drawbacks:
- You need additional free disk space for the uncompressed program
in your /tmp directory. This program is deleted immediately after
decompression, but you still need it for the full execution time
of the program.
- You must have /proc filesystem support as the stub wants to open
/proc/<pid>/exe and needs /proc/<pid>/fd/X. This also means that you
cannot compress programs that are used during the boot sequence
before /proc is mounted.
- Utilities like `top' will display numerical values in the process
name field. This is because Linux computes the process name from
the first argument of the last execve syscall (which is typically
something like /proc/<pid>/fd/3).
- Because of temporary decompression to disk the decompression speed
is not as fast as with the other executable formats. Still, I can see
no noticable delay when starting programs like my ~3 MB emacs (which
is less than 1 MB when compressed :-).
Extra options available for this executable format:
(none)
=head2 NOTES FOR RTM32/PE
Same as win32/pe.
=head2 NOTES FOR TMT/ADAM
This format is used by the TMT Pascal compiler - see http://www.tmt.com/ .
Extra options available for this executable format:
(none)
=head2 NOTES FOR VMLINUZ/386
The vmlinuz/386 and bvmlinuz/386 formats take a gzip-compressed
bootable kernel image ("vmlinuz", "zImage", "bzImage"), gzip-decompress
it and re-compress it with the UPX compression method.
vmlinuz/386 is completely unrelated to the other Linux executable
formats, and it does not share any of their drawbacks.
Notes:
- Be sure that "vmlinuz/386" or "bmlinuz/386" is displayed
during compression - otherwise a wrong executable format
may have been used, and the kernel won't boot.
Benefits:
- Better compression (but note that the kernel was already compressed,
so the improvement is not as large as with other formats).
Still, the bytes saved may be essential for special needs like
bootdisks.
For example, this is what I get for my 2.2.16 kernel:
1589708 vmlinux
641073 bzImage [original]
560755 bzImage.upx [compressed by "upx -9"]
- Much faster decompression at kernel boot time.
Drawbacks:
(none)
Extra options available for this executable format:
(none)
=head2 NOTES FOR WATCOM/LE
UPX has been successfully tested with the following extenders:
DOS4G, DOS4GW, PMODE/W, DOS32a, CauseWay.
The WDOS/X extender is partly supported (for details
see the file bugs BUGS).
Yes, you can use your compressed executables with DOS4GW.
The LX format is not yet supported.
DLLs are not supported.
Extra options available for this executable format:
--le Produce an unbound LE output instead of
keeping the current stub.
=head2 NOTES FOR WIN32/PE
The PE support in UPX is quite stable now, but probably there are
still some incompabilities with some files.
Because of the way UPX (and other packers for this format) works, you
can see increased memory usage of your compressed files. If you start
several instances of huge compressed programs you're wasting memory
because the common segements of the program won't get shared
across the instances.
On the other hand if you're compressing only smaller programs, or
running only one instance of larger programs, then this penalty is
smaller, but it's still there.
If you're running executables from network, then compressed programs
will load faster, and require less bandwidth during execution.
DLLs are supported.
Extra options available for this executable format:
--compress-exports=0 Don't compress the export section.
Use this if you plan to run the compressed
program under Wine.
--compress-exports=1 Compress the export section. [DEFAULT]
Compression of the export section can improve the
compression ratio quite a bit but may not work
with all programs (like winword.exe).
UPX never compresses the export section of a DLL
regardless of this option.
--compress-icons=0 Don't compress any icons.
--compress-icons=1 Compress all but the first icon.
--compress-icons=2 Compress all icons which are not in the
first icon directory. [DEFAULT]
--compress-resources=0 Don't compress any resources at all.
--force Force compression even when there is an
unexpected value in a header field.
Use with care.
--strip-relocs=0 Don't strip relocation records.
--strip-relocs=1 Strip relocation records. [DEFAULT]
This option only works on executables with base
address greater or equal to 0x400000. Usually the
compressed files becomes smaller, but some files
may become larger. Note that the resulting file will
not work under Win32s.
UPX never strips relocations from a DLL
regardless of this option.
=head1 DIAGNOSTICS
Exit status is normally 0; if an error occurs, exit status
is 1. If a warning occurs, exit status is 2.
B<UPX>'s diagnostics are intended to be self-explanatory.
=head1 BUGS
Please report all bugs immediately to the authors.
=head1 AUTHORS
Markus F.X.J. Oberhumer <markus.oberhumer@jk.uni-linz.ac.at>
http://wildsau.idv.uni-linz.ac.at/mfx/
Laszlo Molnar <ml1050@cdata.tvnet.hu>
John F. Reiser <jreiser@BitWagon.com>
=head1 COPYRIGHT
Copyright (C) 1996-2000 Markus Franz Xaver Johannes Oberhumer
Copyright (C) 1996-2000 Laszlo Molnar
Copyright (C) 2000 John Reiser
This program may be used freely, and you are welcome to
redistribute it under certain conditions.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
UPX License Agreement for more details.
You should have received a copy of the UPX License Agreement along
with this program; see the file LICENSE. If not, visit the UPX home page.