fuse-archive uses archive_read_data() to get archive content. With that call, libarchive fills the gaps (holes) of sparse files with null bytes, so fuse-archive has no idea where they are.
It seems that operations on big sparse files could be improved.
% tar xvf test/data/sparse.tar
sparse
% time cp sparse sparse.copy
cp sparse sparse.copy 0.00s user 0.00s system 64% cpu 0.002 total
% time cp sparse sparse.copy
cp sparse sparse.copy 0.00s user 0.00s system 63% cpu 0.001 total
% du sparse.copy
4 sparse.copy
% out/fuse-archive test/data/sparse.tar mnt
fuse-archive: Created mount point 'mnt'
% time cp mnt/sparse sparse.copy
cp mnt/sparse sparse.copy 0.00s user 0.36s system 47% cpu 0.735 total
% time cp mnt/sparse sparse.copy
cp mnt/sparse sparse.copy 0.00s user 0.44s system 52% cpu 0.839 total
% du sparse.copy
1048576 sparse.copy
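The two du results above (4 versus 1048576) reflect allocated space, not file length. The same check can be done programmatically. This is only a minimal sketch, assuming a POSIX system where st_blocks is counted in 512-byte units; the helper name sparseness is hypothetical and not part of fuse-archive:

```python
import os
import tempfile


def sparseness(path):
    """Return (apparent_size, allocated_bytes) for path.

    Assumes st_blocks is expressed in 512-byte units, as on Linux.
    """
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512


if __name__ == "__main__":
    # Create a 1 MiB file made of a single hole (if the file system
    # supports sparse files) and compare both sizes.
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.truncate(1 << 20)
        path = f.name
    apparent, allocated = sparseness(path)
    print(f"apparent={apparent} allocated={allocated}")
    os.unlink(path)
```

A file is sparse when the allocated size is noticeably smaller than the apparent size, which is the case for the extracted file but not for the copy made from the mount.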
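Copying from the mounted archive is hundreds of times slower than copying the plain extracted file, and the resulting copy is no longer sparse (du reports 1048576 instead of 4).
For some reason, the first copy from the mounted archive is faster than the following ones.
For the plain extracted file, the second copy is slightly faster, probably thanks to the kernel cache.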
With fuse-archive -o kernel_cache, the second invocation is faster as well:
% out/fuse-archive -o kernel_cache test/data/sparse.tar mnt
fuse-archive: Created mount point 'mnt'
% time cp mnt/sparse sparse.copy
cp mnt/sparse sparse.copy 0.00s user 0.34s system 46% cpu 0.738 total
% time cp mnt/sparse sparse.copy
cp mnt/sparse sparse.copy 0.00s user 0.17s system 34% cpu 0.491 total
Using archive_read_data_block() directly would bring some benefits, such as:
- support SEEK_HOLE and SEEK_DATA (through FUSE_LSEEK)
- more efficient read operations with tools that support sparseness (coreutils, databases, VMs, etc.)
- possibly more efficient sequential reads in general on big sparse files (probably not: the zeros would have to be put in memory by fuse-archive instead of libarchive, but they would be there anyway)
- report an st_blocks value that would mean something useful
- for some tools, output files would be sparse as well, reducing disk usage and being closer to the original file in the archive.
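To illustrate the bookkeeping this would require, here is a rough sketch, not fuse-archive code. It assumes that the (offset, size) pairs reported by successive archive_read_data_block() calls have been collected for one file, and that anything between two data blocks is a hole. The ExtentMap class and its method names are hypothetical, and rounding each extent up to 512-byte units for st_blocks is only one possible choice:

```python
import bisect


class ExtentMap:
    """Data extents of one file, built from (offset, size) pairs.

    The pairs stand in for what successive archive_read_data_block()
    calls would report; everything between extents is a hole.
    """

    def __init__(self, file_size, blocks):
        self.file_size = file_size
        self.extents = []  # sorted, non-overlapping [start, end) pairs
        for offset, size in sorted(blocks):
            if size == 0:
                continue
            end = offset + size
            if self.extents and offset <= self.extents[-1][1]:
                # Adjacent or overlapping block: extend the previous extent.
                self.extents[-1][1] = max(self.extents[-1][1], end)
            else:
                self.extents.append([offset, end])
        self.starts = [start for start, _ in self.extents]

    def _extent_at(self, pos):
        """Index of the extent containing pos, or None if pos is in a hole."""
        i = bisect.bisect_right(self.starts, pos) - 1
        if i >= 0 and pos < self.extents[i][1]:
            return i
        return None

    def seek_data(self, pos):
        """Offset of the first data byte at or after pos, or None (ENXIO)."""
        if pos >= self.file_size:
            return None
        if self._extent_at(pos) is not None:
            return pos  # already inside data
        nxt = bisect.bisect_right(self.starts, pos)
        if nxt < len(self.extents):
            return self.extents[nxt][0]
        return None  # only a trailing hole remains

    def seek_hole(self, pos):
        """Offset of the first hole at or after pos, or None (ENXIO)."""
        if pos >= self.file_size:
            return None
        i = self._extent_at(pos)
        if i is None:
            return pos  # already inside a hole
        # Inside data: the hole (or the implicit one at EOF) follows it.
        return min(self.extents[i][1], self.file_size)

    def st_blocks(self):
        """Allocated size in 512-byte units, rounding each extent up."""
        return sum((end - start + 511) // 512 for start, end in self.extents)


if __name__ == "__main__":
    # 1 MiB file with 8 KiB of data at the start and 512 bytes in the middle.
    m = ExtentMap(1 << 20, [(0, 4096), (4096, 4096), (1 << 19, 512)])
    print(m.seek_data(8192), m.seek_hole(0), m.st_blocks())
```

A FUSE_LSEEK handler could answer SEEK_DATA and SEEK_HOLE requests from such a map, and getattr could report st_blocks from it instead of a value derived from the apparent size.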