erl_tar
Unix 'tar' utility for reading and writing tar archives
The erl_tar module archives and extracts files to and from a tar file. erl_tar supports the ustar format (IEEE Std 1003.1 and ISO/IEC 9945-1). All modern tar programs (including GNU tar) can read this format. To ensure that GNU tar produces a tar file that erl_tar can read, give the --format=ustar option to GNU tar.
By convention, the name of a tar file should end in ".tar". To abide by the convention, you need to add ".tar" to the name yourself.
Tar files can be created in one operation using the create/2 or create/3 function.
Alternatively, for more control, the open/2, add/3,4, and close/1 functions can be used.
To extract all files from a tar file, use the extract/1 function. To extract only some files or to be able to specify some more options, use the extract/2 function.
To return a list of the files in a tar file, use either the table/1 or table/2 function. To print a list of files to the Erlang shell, use either the t/1 or tt/1 function.
To convert an error term returned from one of the functions above to a readable message, use the format_error/1 function.
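The one-operation workflow described above can be sketched as a small round trip. This is a minimal illustration, not a definitive recipe: the module name tar_roundtrip and the file name roundtrip_demo.tar are hypothetical, and the archive is built from in-memory binaries so that no input files are needed.

```erlang
-module(tar_roundtrip).
-export([run/0]).

%% Creates a small archive from binaries, lists its contents, and reads
%% the entries back into memory instead of extracting them to disk.
run() ->
    Tar = "roundtrip_demo.tar",   % hypothetical file name
    ok = erl_tar:create(Tar, [{"hello.txt", <<"hello">>},
                              {"world.txt", <<"world">>}]),
    {ok, Names} = erl_tar:table(Tar),
    {ok, Entries} = erl_tar:extract(Tar, [memory]),
    ok = file:delete(Tar),
    %% Sort so that the result does not depend on the order in the archive.
    {lists:sort(Names), lists:sort(Entries)}.
```

Calling tar_roundtrip:run() returns the sorted entry names together with the sorted {Name, Binary} pairs read back from the archive.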
UNICODE SUPPORT
If file:native_name_encoding/0 returns utf8, path names will be encoded in UTF-8 when creating tar files and path names will be assumed to be encoded in UTF-8 when extracting tar files.
If file:native_name_encoding/0 returns latin1, no translation of path names will be done.
OTHER STORAGE MEDIA
The erl_tar module normally accesses the tar file on disk using the file module. When other needs arise, you can define your own low-level Erlang functions to perform the reading and writing on the storage media. See init/3 for usage.
An example of this is the sftp support in ssh_sftp:open_tar/3. That function opens a tar file on a remote machine using an sftp channel.
LIMITATIONS
For maximum compatibility, it is safe to archive files with names up to 100 characters in length. Such tar files can generally be extracted by any tar program.
If filenames exceed 100 characters in length, the resulting tar file can only be correctly extracted by a POSIX-compatible tar program (such as Solaris tar), not by GNU tar.
Files with names longer than 256 bytes cannot be stored at all.
The filename of the file that a symbolic link points to is always limited to 100 characters.
Functions
add(TarDescriptor, Filename, Options) -> RetValue
TarDescriptor = term()
Filename = filename()
Options = [Option]
Option = dereference|verbose|{chunks,ChunkSize}
ChunkSize = positive_integer()
RetValue = ok|{error,{Filename,Reason}}
Reason = term()
The add/3 function adds a file to a tar file that has been opened for writing by open/2.
dereference
By default, symbolic links will be stored as symbolic links in the tar file. Use the dereference option to override the default and store the file that the symbolic link points to into the tar file.
verbose
Print an informational message about the file being added.
{chunks,ChunkSize}
Read data in parts from the file. This is intended for memory-limited machines that, for example, build a tar file on a remote machine over sftp.
add(TarDescriptor, FilenameOrBin, NameInArchive, Options) -> RetValue
TarDescriptor = term()
FilenameOrBin = filename()|binary()
Filename = filename()
NameInArchive = filename()
Options = [Option]
Option = dereference|verbose
RetValue = ok|{error,{Filename,Reason}}
Reason = term()
close(TarDescriptor)
TarDescriptor = term()
The close/1 function closes a tar file opened by open/2.
create(Name, FileList) -> RetValue
Name = filename()
FileList = [Filename|{NameInArchive, binary()}|{NameInArchive, Filename}]
Filename = filename()
NameInArchive = filename()
RetValue = ok|{error,{Name,Reason}}
Reason = term()
The create/2 function creates a tar file and archives the files whose names are given in FileList into it. The files may either be read from disk or given as binaries.
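The three kinds of FileList element can be combined in one call. The sketch below assumes hypothetical names (module tar_create_demo, files create_demo_src.txt and create_demo.tar) and writes its own input file first so that it is self-contained.

```erlang
-module(tar_create_demo).
-export([run/0]).

%% Archives one file from disk under its own name, one binary, and the
%% same disk file again under a different name in the archive.
run() ->
    Src = "create_demo_src.txt",  % hypothetical input file
    Tar = "create_demo.tar",      % hypothetical archive name
    ok = file:write_file(Src, <<"from disk">>),
    ok = erl_tar:create(Tar, [Src,
                              {"memory.txt", <<"from memory">>},
                              {"renamed.txt", Src}]),
    {ok, Names} = erl_tar:table(Tar),
    ok = file:delete(Src),
    ok = file:delete(Tar),
    lists:sort(Names).
```

The returned list contains the three names stored in the archive, which shows that the tuple forms control the name in the archive independently of the data source.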
create(Name, FileList, OptionList) -> RetValue
Name = filename()
FileList = [Filename|{NameInArchive, binary()}|{NameInArchive, Filename}]
Filename = filename()
NameInArchive = filename()
OptionList = [Option]
Option = compressed|cooked|dereference|verbose
RetValue = ok|{error,{Name,Reason}}
Reason = term()
The create/3 function creates a tar file and archives the files whose names are given in FileList into it. The files may either be read from disk or given as binaries.
The options in OptionList modify the defaults as follows.
compressed
The entire tar file will be compressed, as if it had been run through the gzip program. To abide by the convention that the name of a compressed tar file should end in ".tar.gz" or ".tgz", you need to add the appropriate extension yourself.
cooked
By default, the open/2 function will open the tar file in raw mode, which is faster but does not allow a remote (Erlang) file server to be used. Adding cooked to the mode list will override the default and open the tar file without the raw option.
dereference
By default, symbolic links will be stored as symbolic links in the tar file. Use the dereference option to override the default and store the file that the symbolic link points to into the tar file.
verbose
Print an informational message about each file being added.
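As an illustration of the compressed option, the following sketch creates a gzip-compressed archive and reads it back. The module name tar_gzip_demo and the file name gzip_demo.tar.gz are hypothetical; the check on the first two bytes relies on the standard gzip magic number.

```erlang
-module(tar_gzip_demo).
-export([run/0]).

%% Writes a compressed archive, checks that the file starts with the
%% gzip magic bytes, and extracts the payload into memory again.
run() ->
    Tar = "gzip_demo.tar.gz",     % extension added by hand, per convention
    Payload = binary:copy(<<"abc">>, 1000),
    ok = erl_tar:create(Tar, [{"data.bin", Payload}], [compressed]),
    {ok, <<16#1f, 16#8b, _/binary>>} = file:read_file(Tar),
    {ok, [{_Name, Extracted}]} = erl_tar:extract(Tar, [compressed, memory]),
    ok = file:delete(Tar),
    Extracted =:= Payload.
```

The same compressed option is passed on both sides here, so creation and extraction treat the file consistently.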
extract(Name) -> RetValue
Name = filename() | {binary,Binary} | {file,Fd}
Binary = binary()
Fd = file_descriptor()
RetValue = ok|{error,{Name,Reason}}
Reason = term()
The extract/1 function extracts all files from a tar archive.
If the Name argument is given as "{binary,Binary}", the contents of the binary are assumed to be a tar archive.
If the Name argument is given as "{file,Fd}", Fd is assumed to be a file descriptor returned from the file:open/2 function.
Otherwise, Name should be a filename.
extract(Name, OptionList) -> RetValue
Name = filename() | {binary,Binary} | {file,Fd}
Binary = binary()
Fd = file_descriptor()
OptionList = [Option]
Option = {cwd,Cwd}|{files,FileList}|compressed|cooked|memory|keep_old_files|verbose
Cwd = [dirname()]
FileList = [filename()]
RetValue = ok|MemoryRetValue|{error,{Name,Reason}}
MemoryRetValue = {ok, [{NameInArchive,binary()}]}
NameInArchive = filename()
Reason = term()
The extract/2 function extracts files from a tar archive.
If the Name argument is given as "{binary,Binary}", the contents of the binary are assumed to be a tar archive.
If the Name argument is given as "{file,Fd}", Fd is assumed to be a file descriptor returned from the file:open/2 function.
Otherwise, Name should be a filename.
The following options modify the defaults for the extraction.
{cwd,Cwd}
Files with relative filenames will by default be extracted to the current working directory. Given the {cwd,Cwd} option, the extract/2 function will extract into the directory Cwd instead of to the current working directory.
{files,FileList}
By default, all files will be extracted from the tar file. Given the {files,FileList} option, the extract/2 function will only extract the files whose names are included in FileList.
compressed
Given the compressed option, the extract/2 function will uncompress the file while extracting. If the tar file is not actually compressed, the compressed option will effectively be ignored.
cooked
By default, the open/2 function will open the tar file in raw mode, which is faster but does not allow a remote (Erlang) file server to be used. Adding cooked to the mode list will override the default and open the tar file without the raw option.
memory
Instead of extracting to a directory, the memory option will give the result as a list of tuples {Filename, Binary}, where Binary is a binary containing the extracted data of the file named Filename in the tar file.
keep_old_files
By default, all existing files with the same name as a file in the tar file will be overwritten. Given the keep_old_files option, the extract/2 function will not overwrite any existing files.
verbose
Print an informational message as each file is being extracted.
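A few of the extraction options can be combined as in the sketch below. The module name tar_extract_demo, the archive name, and the output directory extract_demo_out are hypothetical. The directory is created up front as a precaution, which is an assumption of this sketch rather than a documented requirement.

```erlang
-module(tar_extract_demo).
-export([run/0]).

%% Extracts a single named entry into memory, then extracts the whole
%% archive into a separate directory using the cwd option.
run() ->
    Tar = "extract_demo.tar",     % hypothetical archive name
    Dir = "extract_demo_out",     % hypothetical target directory
    ok = erl_tar:create(Tar, [{"a.txt", <<"alpha">>},
                              {"b.txt", <<"beta">>}]),
    %% Only a.txt is selected, and it is returned instead of written to disk.
    {ok, [{_Name, OnlyA}]} =
        erl_tar:extract(Tar, [{files, ["a.txt"]}, memory]),
    %% Everything is written below Dir instead of the current directory.
    _ = file:make_dir(Dir),
    ok = erl_tar:extract(Tar, [{cwd, Dir}]),
    {ok, B} = file:read_file(filename:join(Dir, "b.txt")),
    %% Clean up the files created by this sketch.
    _ = file:delete(filename:join(Dir, "a.txt")),
    _ = file:delete(filename:join(Dir, "b.txt")),
    _ = file:del_dir(Dir),
    _ = file:delete(Tar),
    {OnlyA, B}.
```

The first element of the result comes from the in-memory extraction and the second from the file written under the cwd directory.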
format_error(Reason) -> string()
Reason = term()
The format_error/1 function converts an error reason term to a human-readable error message string.
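A possible way to use format_error/1 is to pass it the Reason part of an error tuple returned by another erl_tar function. In this sketch the module name tar_error_demo and the file name are hypothetical, and the file is assumed not to exist so that table/1 fails. The exact wording of the message is not relied upon.

```erlang
-module(tar_error_demo).
-export([run/0]).

%% Provokes an error by listing an archive that does not exist and turns
%% the returned reason into a printable message.
run() ->
    Missing = "no_such_archive_for_demo.tar",  % assumed not to exist
    {error, {_Name, Reason}} = erl_tar:table(Missing),
    Msg = lists:flatten(erl_tar:format_error(Reason)),
    {Reason, Msg}.
```

The message can then be printed, for example with io:format("~ts~n", [Msg]).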
open(Name, OpenModeList) -> RetValue
Name = filename()
OpenModeList = [OpenMode]
OpenMode = write|compressed|cooked
RetValue = {ok,TarDescriptor}|{error,{Name,Reason}}
TarDescriptor = term()
Reason = term()
The open/2 function creates a tar file for writing. (Any existing file with the same name will be truncated.)
By convention, the name of a tar file should end in ".tar". To abide by the convention, you need to add ".tar" to the name yourself.
In addition to the write atom, the following atoms may be added to OpenModeList:
compressed
The entire tar file will be compressed, as if it had been run through the gzip program. To abide by the convention that the name of a compressed tar file should end in ".tar.gz" or ".tgz", you need to add the appropriate extension yourself.
cooked
By default, the open/2 function will open the tar file in raw mode, which is faster but does not allow a remote (Erlang) file server to be used. Adding cooked to the mode list will override the default and open the tar file without the raw option.
Use the add/3,4 functions to add one file at a time to an opened tar file. When you are finished adding files, use the close/1 function to close the tar file.
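The open, add, and close sequence might look like the following sketch. The module name tar_incremental_demo and both file names are hypothetical. add/3 is used for a file on disk and add/4 for a binary stored under an explicit name in the archive.

```erlang
-module(tar_incremental_demo).
-export([run/0]).

%% Builds an archive entry by entry: one file read from disk and one
%% binary, then closes the archive and lists what was stored.
run() ->
    Src = "incremental_demo_src.txt",  % hypothetical input file
    Tar = "incremental_demo.tar",      % hypothetical archive name
    ok = file:write_file(Src, <<"on disk">>),
    {ok, TarDesc} = erl_tar:open(Tar, [write]),
    ok = erl_tar:add(TarDesc, Src, []),                         % add/3
    ok = erl_tar:add(TarDesc, <<"in memory">>, "mem.txt", []),  % add/4
    ok = erl_tar:close(TarDesc),
    {ok, Names} = erl_tar:table(Tar),
    ok = file:delete(Src),
    ok = file:delete(Tar),
    lists:sort(Names).
```

TarDesc is only passed back to erl_tar functions here and is never inspected.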
Warning!
The TarDescriptor term is not a file descriptor. You should not rely on the specific contents of the TarDescriptor term, as it may change in future versions as more features are added to the erl_tar module.
init(UserPrivate, AccessMode, Fun) -> {ok,TarDescriptor} | {error,Reason}
UserPrivate = term()
AccessMode = [write] | [read]
Fun when AccessMode is [write] = fun(write, {UserPrivate,DataToWrite})->...; (position,{UserPrivate,Position})->...; (close, UserPrivate)->... end
Fun when AccessMode is [read] = fun(read2, {UserPrivate,Size})->...; (position,{UserPrivate,Position})->...; (close, UserPrivate)->... end
TarDescriptor = term()
Reason = term()
The Fun defines what to do when the different storage operations are called from the higher-level tar handling functions (add/3, add/4, close/1, ...).
The Fun will be called when the tar function wants to do a low-level operation, like writing a block to a file. The Fun is called as Fun(Op,{UserPrivate,Parameters...}) where Op is the operation name, UserPrivate is the term passed as the first argument to init/3 and Parameters... are the data added by the tar function to be passed down to the storage handling function.
The parameter UserPrivate is typically the result of opening a low-level structure like a file descriptor, an sftp channel id or such. The different Fun clauses operate on that very term.
The parameter lists of the Fun clauses are:
(write, {UserPrivate,DataToWrite})
Write DataToWrite using UserPrivate.
(close, UserPrivate)
Close the access.
(read2, {UserPrivate,Size})
Read using UserPrivate, but only Size bytes. Note that there is only an arity-2 read function, not an arity-1.
(position,{UserPrivate,Position})
Set the position of UserPrivate as defined for files in file:position/2.
A complete Fun parameter for reading and writing on files using the file module could be:
ExampleFun = fun(write, {Fd,Data}) -> file:write(Fd, Data);
                (position, {Fd,Pos}) -> file:position(Fd, Pos);
                (read2, {Fd,Size}) -> file:read(Fd, Size);
                (close, Fd) -> file:close(Fd)
             end
where Fd was given to the init/3 function as:
{ok,Fd} = file:open(Name, ...),
{ok,TarDesc} = erl_tar:init(Fd, [write], ExampleFun),
The TarDesc is then used:
erl_tar:add(TarDesc, SomeValueIwantToAdd, FileNameInTarFile, []),
....,
erl_tar:close(TarDesc)
When the erl_tar core wants to, for example, write a piece of Data, it calls ExampleFun(write,{UserPrivate,Data}).
Note!
The example above with file module operations does not need to be used directly, since that is in principle what the open/2 function does.
Warning!
The TarDescriptor term is not a file descriptor. You should not rely on the specific contents of the TarDescriptor term, as it may change in future versions as more features are added to the erl_tar module.
table(Name) -> RetValue
Name = filename()
RetValue = {ok,[string()]}|{error,{Name,Reason}}
Reason = term()
The table/1 function retrieves the names of all files in the tar file Name.
table(Name, Options)
Name = filename()
The table/2 function retrieves the names of all files in the tar file Name.
t(Name)
Name = filename()
The t/1 function prints the names of all files in the tar file Name to the Erlang shell. (Similar to "tar t".)
tt(Name)
Name = filename()
The tt/1 function prints names and information about all files in the tar file Name to the Erlang shell. (Similar to "tar tv".)