algorithm - Listing (or counting) files in .tar/.tar.gz archives: What is the time complexity?


What is the time complexity of listing the file names in a .tar archive? Is it O(size of the archive file) or O(number of files in the archive)?

According to the Wikipedia article ( https://en.wikipedia.org/wiki/tar_(computing) ), tar archives do not support random access, and in order to list the file names "one has to read through the entire archive", which I understand to mean O(size of the archive file). On the other hand, each file in a tar archive has a header that specifies its length, so one could do a single HDD seek per file in the archive, in which case the time complexity would be O(number of files in the archive).
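For concreteness, here is a minimal sketch of that header-walk idea in Python (my own illustration, assuming a plain POSIX ustar layout; it ignores pax/GNU extensions such as long names or base-256 sizes):

    def list_tar_names(path):
        """Walk an uncompressed ustar archive, reading only the 512-byte
        headers and seeking over each member's data."""
        names = []
        with open(path, 'rb') as f:
            while True:
                header = f.read(512)
                if len(header) < 512 or header == b'\x00' * 512:
                    break  # EOF or the zero block that marks end of archive
                name = header[0:100].rstrip(b'\x00').decode('utf-8', 'replace')
                # size field: 12 bytes at offset 124, octal, NUL/space padded
                size = int(header[124:136].rstrip(b'\x00 ') or b'0', 8)
                names.append(name)
                # member data is padded to a multiple of 512 bytes; seek past it
                f.seek((size + 511) // 512 * 512, 1)  # whence=1: relative seek
        return names

The loop does one read and one seek per archive member, which would be the O(number of files) behaviour, as long as the underlying storage actually honours the seek.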

And what about .tar.gz archives? I suppose that in that case we are not able to perform multiple seeks without uncompressing, so it takes O(size of the archive file) anyway?

It depends on the storage!

Uncompressed tar

For tape archives (you know, "tar"s), it is linear in the byte length either way, because fast-forwarding is still linear in the length of tape you need to fast-forward over.

For small files on modern storage: the same; you don't ask an SSD for 20 bytes of storage, you read at least 4 KB at once. In theory, that means you could skip over a 1 GB file pretty much instantly. In practice, experience tells me that doesn't happen, and I don't know why; to me, a "next_block_after" function should just skip forward. *shrugs*
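As a sketch of what that looks like with Python's stdlib tarfile module (as far as I can tell, on a seekable uncompressed archive it reads each header and seeks past the member data rather than reading it; 'backup.tar' is just a placeholder name):

    import tarfile

    def list_uncompressed(path):
        # 'r:' means "no compression"; on a regular seekable file the module
        # walks the headers and skips over each member's data.
        with tarfile.open(path, mode='r:') as tf:
            return tf.getnames()

    print(list_uncompressed('backup.tar'))  # placeholder archive name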

Compressed tar

Yes, in general you'll have to uncompress to know how long the content is before you can seek somewhere. I don't think there's a compression format in common use that keeps that kind of table of "intermediate" lengths to speed up seeking.
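A sketch of the compressed case with the same stdlib module: 'r|gz' streams the archive through gzip, and since gzip offers no random access, every compressed byte still has to go through the decompressor even though only the headers are kept ('backup.tar.gz' is a placeholder):

    import tarfile

    def list_gzipped(path):
        # 'r|gz' reads the gzip stream strictly sequentially; there is no way
        # to seek in the compressed data, so the cost is O(size of the archive).
        names = []
        with tarfile.open(path, mode='r|gz') as tf:
            for member in tf:
                names.append(member.name)
        return names

    print(list_gzipped('backup.tar.gz'))  # placeholder archive name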

