TODO

author: brett
date: Fri, 19 Oct 2007 22:46:20 -0400
branch: trunk
changeset 28: 4d88f2231d33
parent 25: ef62f2f55eb8
child 31: c3a2760d1c3a
permissions: -rw-r--r--

[svn] Change all the license notices from GPLv2 to GPLv3.

Instead of checking the archive contents, figuring out what to do, and
doing it, we now always extract the archive to a private directory and
then shuffle the contents around appropriately. I expected this to be a
bigger win than my benchmarks have borne out, but I'm sticking with this
strategy because it provides a cleaner separation of responsibilities
between the extractors and the archive type handlers. I also have to
believe it's a much better way to handle bigger archives, since we now
read each one once instead of twice.
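
The strategy in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code from this changeset: the function name extract_then_place, the ".extract-" prefix, and the "one top-level entry means a tidy archive" rule are all assumptions made for the example, and it handles tar archives only, using Python's standard library.

```python
import os
import shutil
import tarfile
import tempfile


def extract_then_place(archive_path, dest_parent="."):
    """Extract into a private directory, then decide where the contents go.

    Hypothetical helper for illustration; returns the final path.
    """
    # Name to use for the directory when the archive turns out to be a bomb.
    base = os.path.basename(archive_path).split(".")[0] or "extracted"

    # Private directory next to the destination, so the final move is a
    # cheap rename on the same filesystem.
    tmpdir = tempfile.mkdtemp(prefix=".extract-", dir=dest_parent)
    try:
        # The archive is read exactly once. Untrusted archives would need
        # extra member-path validation before this call.
        with tarfile.open(archive_path) as tar:
            tar.extractall(tmpdir)

        # The bomb check only looks at the top level of the private
        # directory instead of scanning every archive entry.
        entries = os.listdir(tmpdir)
        if len(entries) == 1:
            # Tidy archive: move its single top-level entry into place.
            source = os.path.join(tmpdir, entries[0])
            target = os.path.join(dest_parent, entries[0])
        else:
            # Bomb (or empty archive): keep everything in one directory.
            source = tmpdir
            target = os.path.join(dest_parent, base)

        if os.path.exists(target):
            raise FileExistsError(target)
        shutil.move(source, target)
        return target
    finally:
        # No-op if tmpdir itself was moved into place above.
        shutil.rmtree(tmpdir, ignore_errors=True)
```

Calling extract_then_place("project.tar.gz") would leave either the archive's own top-level entry or a new "project" directory in the current directory. Because the placement decision happens after extraction, the extractor never needs to know anything about the layout policy.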

We should always extract to a new, temporary directory (except maybe in the
straight decompression case), and then move that directory based on what we
actually want.  This has several advantages:

* Much easier to check whether or not the archive is a bomb (O(1) operation)
* Can find other archives more reliably
* Can set up a direct pipe from a decompressor to the unarchiver, since we're
  not interested in reading it multiple times anymore.
* All this should mean x is faster, too.

Things which I have a use case/anti-use case for:
* CAB extraction.
* Use file to detect the archive type.
* Support lzma compression (http://tukaani.org/lzma/download)
* Support pisi packages (http://paketler.pardus.org.tr/pardus-2007/)
* Steal ideas from <http://martin.ankerl.com/files/e>.
* Figure out what the deal is with strerror.  (done?)
* Better error messages (file doesn't exist, isn't readable, etc.)
* Consistently raise and handle exceptions.

Things that are generally good:
* Better tests.
* Better error messages.

Things I think might be good but can't prove:
* Take URLs as arguments.
* Consider having options about whether or not to make sane directories,
  have tarbomb protection, etc.
* Use zipfile instead of the zip commands.
* Processing from stdin.
* Extracting control.tar.gz from deb files.
* shar support.
