scripts/dtrx

Sat, 21 Apr 2007 13:09:58 -0400

author
brett
date
Sat, 21 Apr 2007 13:09:58 -0400
branch
trunk
changeset 22
b240777ae53e
parent 20
69c93c3e6972
child 23
039dd321a7d0
permissions
-rwxr-xr-x

[svn] Improve the way we check archive contents. If all the entries look like
they're in ., they really shouldn't count as being in the same directory;
look at the next piece of the path. If the archive only has one
non-directory item, report that more clearly. You'll be able to tell by
whether or not there's a trailing slash in the prompt.

Improve the tests for doing straight decompression, and seek to the
beginning of the archive before we start writing to the file -- otherwise,
we write 0-byte files.

Lots of new ideas in the TODO. I think I'll do another release once
recursion is interactive.

1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
1 #!/usr/bin/env python
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
2 #
19
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
3 # dtrx -- Intelligently extract various archive types.
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
4 # Copyright (c) 2006 Brett Smith <brettcsmith@brettcsmith.org>.
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
5 #
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
6 # This program is free software; you can redistribute it and/or modify it
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
7 # under the terms of the GNU General Public License as published by the
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
8 # Free Software Foundation; either version 2 of the License, or (at your
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
9 # option) any later version.
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
10 #
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
11 # This program is distributed in the hope that it will be useful, but
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
12 # WITHOUT ANY WARRANTY; without even the implied warranty of
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
13 # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
14 # Public License for more details.
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
15 #
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
16 # You should have received a copy of the GNU General Public License along
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
17 # with this program; if not, write to the Free Software Foundation, Inc.,
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
18 # 51 Franklin Street, 5th Floor, Boston, MA, 02111.
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
19
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
20 import errno
12
5d202467c589 [svn] Introduce a real logging system. Right now all this really gets us is the
brett
parents: 11
diff changeset
21 import logging
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
22 import mimetypes
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
23 import optparse
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
24 import os
15
28dbd52a8bb8 [svn] Add a -f/--flat option, which will extract the archive contents into the
brett
parents: 14
diff changeset
25 import stat
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
26 import subprocess
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
27 import sys
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
28 import tempfile
20
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
29 import textwrap
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
30
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
31 from cStringIO import StringIO
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
32
19
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
33 VERSION = "4.0"
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
34 VERSION_BANNER = """dtrx version %s
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
35 Copyright (c) 2006 Brett Smith <brettcsmith@brettcsmith.org>
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
36
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
37 This program is free software; you can redistribute it and/or modify it
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
38 under the terms of the GNU General Public License as published by the
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
39 Free Software Foundation; either version 2 of the License, or (at your
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
40 option) any later version.
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
41
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
42 This program is distributed in the hope that it will be useful, but
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
43 WITHOUT ANY WARRANTY; without even the implied warranty of
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
44 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
45 Public License for more details.""" % (VERSION,)
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
46
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
47 MATCHING_DIRECTORY = 1
22
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
48 ONE_ENTRY = 2
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
49 BOMB = 3
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
50 EMPTY = 4
22
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
51 ONE_ENTRY_KNOWN = 5
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
52
20
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
53 EXTRACT_HERE = 1
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
54 EXTRACT_WRAP = 2
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
55 EXTRACT_RENAME = 3
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
56
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
57 mimetypes.encodings_map.setdefault('.bz2', 'bzip2')
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
58 mimetypes.types_map['.exe'] = 'application/x-msdos-program'
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
59
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
60 def run_command(command, description, stdout=None, stderr=None, stdin=None):
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
61 process = subprocess.Popen(command, stdin=stdin, stdout=stdout,
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
62 stderr=stderr)
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
63 status = process.wait()
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
64 for pipe in (process.stdout, process.stderr):
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
65 try:
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
66 pipe.close()
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
67 except AttributeError:
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
68 pass
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
69 if status != 0:
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
70 return ("%s error: '%s' returned status code %s" %
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
71 (description, ' '.join(command), status))
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
72 return None
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
73
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
74 class FilenameChecker(object):
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
75 def __init__(self, original_name):
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
76 self.original_name = original_name
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
77
17
481a2b4be471 [svn] Lots of tests for various boundary cases, and slightly better handling for
brett
parents: 16
diff changeset
78 def is_free(self, filename):
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
79 return not os.path.exists(filename)
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
80
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
81 def check(self):
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
82 for suffix in [''] + ['.%s' % (x,) for x in range(1, 10)]:
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
83 filename = '%s%s' % (self.original_name, suffix)
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
84 if self.is_free(filename):
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
85 return filename
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
86 raise ValueError("all alternatives for name %s taken" %
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
87 (self.original_name,))
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
88
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
89
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
90 class DirectoryChecker(FilenameChecker):
17
481a2b4be471 [svn] Lots of tests for various boundary cases, and slightly better handling for
brett
parents: 16
diff changeset
91 def is_free(self, filename):
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
92 try:
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
93 os.mkdir(filename)
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
94 except OSError, error:
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
95 if error.errno == errno.EEXIST:
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
96 return False
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
97 raise
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
98 return True
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
99
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
100
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
101 class ExtractorError(Exception):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
102 pass
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
103
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
104
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
105 class ProcessStreamer(object):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
106 def __init__(self, command, stdin, description="checking contents",
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
107 stderr=None):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
108 self.process = subprocess.Popen(command, bufsize=1, stdin=stdin,
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
109 stdout=subprocess.PIPE, stderr=stderr)
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
110 self.command = ' '.join(command)
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
111 self.description = description
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
112
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
113 def __iter__(self):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
114 return self
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
115
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
116 def next(self):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
117 line = self.process.stdout.readline()
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
118 if line:
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
119 return line.rstrip('\n')
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
120 else:
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
121 raise StopIteration
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
122
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
123 def stop(self):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
124 while self.process.stdout.readline():
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
125 pass
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
126 self.process.stdout.close()
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
127 status = self.process.wait()
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
128 if status != 0:
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
129 raise ExtractorError("%s error: '%s' returned status code %s" %
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
130 (self.description, self.command, status))
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
131 try:
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
132 self.process.stderr.close()
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
133 except AttributeError:
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
134 pass
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
135
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
136
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
137 class BaseExtractor(object):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
138 decoders = {'bzip2': 'bzcat', 'gzip': 'zcat', 'compress': 'zcat'}
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
139
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
140 name_checker = DirectoryChecker
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
141
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
142 def __init__(self, filename, mimetype, encoding):
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
143 if encoding and (not self.decoders.has_key(encoding)):
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
144 raise ValueError("unrecognized encoding %s" % (encoding,))
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
145 self.filename = os.path.realpath(filename)
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
146 self.mimetype = mimetype
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
147 self.encoding = encoding
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
148 self.included_archives = []
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
149 try:
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
150 self.archive = open(filename, 'r')
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
151 except (IOError, OSError), error:
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
152 raise ExtractorError("could not open %s: %s" %
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
153 (filename, error.strerror))
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
154 if encoding:
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
155 self.pipe([self.decoders[encoding]], "decoding")
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
156 self.prepare()
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
157
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
158 def run(self, command, description="extraction", stdout=None, stderr=None,
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
159 stdin=None):
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
160 error = run_command(command, description, stdout, stderr, stdin)
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
161 if error:
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
162 raise ExtractorError(error)
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
163
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
164 def pipe(self, command, description, stderr=None):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
165 output = tempfile.TemporaryFile()
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
166 self.run(command, description, output, stderr, self.archive)
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
167 self.archive.close()
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
168 self.archive = output
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
169 self.archive.flush()
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
170
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
171 def prepare(self):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
172 pass
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
173
22
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
174 def check_included_archive(self, filename):
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
175 if extractor_map.has_key(mimetypes.guess_type(filename)[0]):
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
176 self.included_archives.append(filename)
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
177
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
178 def check_first_filename(self, filenames):
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
179 try:
22
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
180 first_filename = filenames.next()
2
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
181 except StopIteration:
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
182 filenames.stop()
22
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
183 return (None, None)
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
184 self.check_included_archive(first_filename)
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
185 parts = first_filename.split('/')
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
186 first_part = [parts[0]]
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
187 if parts[0] == '.':
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
188 first_part.append(parts[1])
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
189 return (first_filename, '/'.join(first_part + ['']))
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
190
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
191 def check_second_filename(self, filenames, first_part, first_filename):
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
192 try:
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
193 filename = filenames.next()
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
194 except StopIteration:
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
195 return ONE_ENTRY, first_filename
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
196 self.check_included_archive(filename)
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
197 if not filename.startswith(first_part):
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
198 return BOMB, None
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
199 return None, first_part
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
200
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
201 def check_contents(self):
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
202 filenames = self.get_filenames()
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
203 first_filename, first_part = self.check_first_filename(filenames)
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
204 if first_filename is None:
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
205 return (EMPTY, None)
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
206 archive_type, type_info = self.check_second_filename(filenames,
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
207 first_part,
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
208 first_filename)
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
209 for filename in filenames:
22
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
210 self.check_included_archive(filename)
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
211 if (archive_type != BOMB) and (not filename.startswith(first_part)):
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
212 archive_type = BOMB
22
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
213 type_info = None
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
214 filenames.stop()
22
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
215 if archive_type is None:
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
216 if self.basename() == first_part[:-1]:
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
217 archive_type = MATCHING_DIRECTORY
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
218 else:
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
219 archive_type = ONE_ENTRY
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
220 return archive_type, type_info
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
221
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
222 def basename(self):
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
223 pieces = os.path.basename(self.filename).split('.')
2
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
224 extension = '.' + pieces[-1]
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
225 if mimetypes.encodings_map.has_key(extension):
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
226 pieces.pop()
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
227 extension = '.' + pieces[-1]
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
228 if (mimetypes.types_map.has_key(extension) or
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
229 mimetypes.common_types.has_key(extension) or
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
230 mimetypes.suffix_map.has_key(extension)):
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
231 pieces.pop()
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
232 return '.'.join(pieces)
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
233
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
234 def extract(self, path):
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
235 old_path = os.path.realpath(os.curdir)
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
236 os.chdir(path)
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
237 self.archive.seek(0, 0)
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
238 self.extract_archive()
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
239 os.chdir(old_path)
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
240
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
241
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
242 class TarExtractor(BaseExtractor):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
243 def get_filenames(self):
15
28dbd52a8bb8 [svn] Add a -f/--flat option, which will extract the archive contents into the
brett
parents: 14
diff changeset
244 self.archive.seek(0, 0)
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
245 return ProcessStreamer(['tar', '-t'], self.archive)
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
246
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
247 def extract_archive(self):
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
248 self.run(['tar', '-x'], stdin=self.archive)
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
249
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
250
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
251 class ZipExtractor(BaseExtractor):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
252 def __init__(self, filename, mimetype, encoding):
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
253 self.filename = os.path.realpath(filename)
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
254 self.mimetype = mimetype
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
255 self.encoding = encoding
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
256 self.included_archives = []
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
257 self.archive = StringIO()
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
258
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
259 def get_filenames(self):
15
28dbd52a8bb8 [svn] Add a -f/--flat option, which will extract the archive contents into the
brett
parents: 14
diff changeset
260 self.archive.seek(0, 0)
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
261 return ProcessStreamer(['zipinfo', '-1', self.filename], None)
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
262
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
263 def extract_archive(self):
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
264 self.run(['unzip', '-q', self.filename])
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
265
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
266
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
267 class CpioExtractor(BaseExtractor):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
268 def get_filenames(self):
15
28dbd52a8bb8 [svn] Add a -f/--flat option, which will extract the archive contents into the
brett
parents: 14
diff changeset
269 self.archive.seek(0, 0)
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
270 return ProcessStreamer(['cpio', '-t'], self.archive,
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
271 stderr=subprocess.PIPE)
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
272
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
273 def extract_archive(self):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
274 self.run(['cpio', '-i', '--make-directories',
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
275 '--no-absolute-filenames'],
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
276 stderr=subprocess.PIPE, stdin=self.archive)
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
277
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
278
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
279 class RPMExtractor(CpioExtractor):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
280 def prepare(self):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
281 self.pipe(['rpm2cpio', '-'], "rpm2cpio")
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
282
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
283 def basename(self):
9
920417b8acc9 [svn] Fix issues with basename methods. First, string's rsplit method only
brett
parents: 8
diff changeset
284 pieces = os.path.basename(self.filename).split('.')
2
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
285 if len(pieces) == 1:
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
286 return pieces[0]
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
287 elif pieces[-1] != 'rpm':
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
288 return BaseExtractor.basename(self)
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
289 pieces.pop()
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
290 if len(pieces) == 1:
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
291 return pieces[0]
9
920417b8acc9 [svn] Fix issues with basename methods. First, string's rsplit method only
brett
parents: 8
diff changeset
292 elif len(pieces[-1]) < 8:
2
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
293 pieces.pop()
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
294 return '.'.join(pieces)
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
295
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
296 def check_contents(self):
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
297 CpioExtractor.check_contents(self)
22
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
298 return (BOMB, None)
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
299
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
300
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
301 class DebExtractor(TarExtractor):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
302 def prepare(self):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
303 self.pipe(['ar', 'p', self.filename, 'data.tar.gz'],
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
304 "data.tar.gz extraction")
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
305 self.archive.seek(0, 0)
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
306 self.pipe(['zcat'], "data.tar.gz decompression")
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
307
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
308 def basename(self):
9
920417b8acc9 [svn] Fix issues with basename methods. First, string's rsplit method only
brett
parents: 8
diff changeset
309 pieces = os.path.basename(self.filename).split('_')
2
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
310 if len(pieces) == 1:
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
311 return pieces[0]
9
920417b8acc9 [svn] Fix issues with basename methods. First, string's rsplit method only
brett
parents: 8
diff changeset
312 last_piece = pieces.pop()
920417b8acc9 [svn] Fix issues with basename methods. First, string's rsplit method only
brett
parents: 8
diff changeset
313 if (len(last_piece) > 10) or (not last_piece.endswith('.deb')):
2
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
314 return BaseExtractor.basename(self)
9
920417b8acc9 [svn] Fix issues with basename methods. First, string's rsplit method only
brett
parents: 8
diff changeset
315 return '_'.join(pieces)
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
316
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
317 def check_contents(self):
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
318 TarExtractor.check_contents(self)
22
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
319 return (BOMB, None)
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
320
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
321
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
322 class CompressionExtractor(BaseExtractor):
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
323 name_checker = FilenameChecker
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
324
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
325 def basename(self):
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
326 pieces = os.path.basename(self.filename).split('.')
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
327 extension = '.' + pieces[-1]
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
328 if mimetypes.encodings_map.has_key(extension):
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
329 pieces.pop()
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
330 return '.'.join(pieces)
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
331
15
28dbd52a8bb8 [svn] Add a -f/--flat option, which will extract the archive contents into the
brett
parents: 14
diff changeset
332 def get_filenames(self):
28dbd52a8bb8 [svn] Add a -f/--flat option, which will extract the archive contents into the
brett
parents: 14
diff changeset
333 yield self.basename()
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
334
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
335 def check_contents(self):
22
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
336 return (ONE_ENTRY_KNOWN, self.basename())
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
337
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
338 def extract(self, path):
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
339 output = open(path, 'w')
22
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
340 self.archive.seek(0, 0)
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
341 self.run(['cat'], "output write", stdin=self.archive, stdout=output)
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
342 output.close()
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
343
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
344
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
345 class BaseHandler(object):
22
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
346 def __init__(self, extractor, contents, content_name, options):
19
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
347 self.logger = logging.getLogger('dtrx-log')
8
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
348 self.extractor = extractor
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
349 self.contents = contents
22
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
350 self.content_name = content_name
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
351 self.options = options
17
481a2b4be471 [svn] Lots of tests for various boundary cases, and slightly better handling for
brett
parents: 16
diff changeset
352 self.target = None
16
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
353
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
354 def extract(self):
8
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
355 try:
16
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
356 self.extractor.extract(self.target)
17
481a2b4be471 [svn] Lots of tests for various boundary cases, and slightly better handling for
brett
parents: 16
diff changeset
357 except (ExtractorError, IOError, OSError), error:
15
28dbd52a8bb8 [svn] Add a -f/--flat option, which will extract the archive contents into the
brett
parents: 14
diff changeset
358 return str(error)
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
359
8
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
360 def cleanup(self):
17
481a2b4be471 [svn] Lots of tests for various boundary cases, and slightly better handling for
brett
parents: 16
diff changeset
361 if self.target is None:
481a2b4be471 [svn] Lots of tests for various boundary cases, and slightly better handling for
brett
parents: 16
diff changeset
362 return
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
363 command = 'find'
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
364 status = subprocess.call(['find', self.target, '-type', 'd',
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
365 '-exec', 'chmod', 'u+rwx', '{}', ';'])
8
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
366 if status == 0:
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
367 command = 'chmod'
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
368 status = subprocess.call(['chmod', '-R', 'u+rw', self.target])
8
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
369 if status != 0:
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
370 return "%s returned with exit status %s" % (command, status)
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
371
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
372
16
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
373 # The "where to extract" table, with options and archive types.
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
374 # This dictates the contents of each can_handle method.
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
375 #
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
376 # Flat Overwrite None
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
377 # File basename basename FilenameChecked
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
378 # Match . . tempdir + checked
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
379 # Bomb . basename DirectoryChecked
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
380
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
381 class FlatHandler(BaseHandler):
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
382 def can_handle(contents, options):
22
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
383 return ((options.flat and (contents != ONE_ENTRY_KNOWN)) or
16
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
384 (options.overwrite and (contents == MATCHING_DIRECTORY)))
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
385 can_handle = staticmethod(can_handle)
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
386
22
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
387 def __init__(self, extractor, contents, content_name, options):
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
388 BaseHandler.__init__(self, extractor, contents, content_name, options)
17
481a2b4be471 [svn] Lots of tests for various boundary cases, and slightly better handling for
brett
parents: 16
diff changeset
389 self.target = '.'
481a2b4be471 [svn] Lots of tests for various boundary cases, and slightly better handling for
brett
parents: 16
diff changeset
390
481a2b4be471 [svn] Lots of tests for various boundary cases, and slightly better handling for
brett
parents: 16
diff changeset
391 def cleanup(self):
481a2b4be471 [svn] Lots of tests for various boundary cases, and slightly better handling for
brett
parents: 16
diff changeset
392 for filename in self.extractor.get_filenames():
481a2b4be471 [svn] Lots of tests for various boundary cases, and slightly better handling for
brett
parents: 16
diff changeset
393 stat_info = os.stat(filename)
481a2b4be471 [svn] Lots of tests for various boundary cases, and slightly better handling for
brett
parents: 16
diff changeset
394 perms = stat.S_IRUSR | stat.S_IWUSR
481a2b4be471 [svn] Lots of tests for various boundary cases, and slightly better handling for
brett
parents: 16
diff changeset
395 if stat.S_ISDIR(stat_info.st_mode):
481a2b4be471 [svn] Lots of tests for various boundary cases, and slightly better handling for
brett
parents: 16
diff changeset
396 perms |= stat.S_IXUSR
481a2b4be471 [svn] Lots of tests for various boundary cases, and slightly better handling for
brett
parents: 16
diff changeset
397 os.chmod(filename, stat_info.st_mode | perms)
16
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
398
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
399
16
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
400 class OverwriteHandler(BaseHandler):
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
401 def can_handle(contents, options):
22
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
402 return ((options.flat and (contents == ONE_ENTRY_KNOWN)) or
16
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
403 (options.overwrite and (contents != MATCHING_DIRECTORY)))
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
404 can_handle = staticmethod(can_handle)
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
405
22
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
406 def __init__(self, extractor, contents, content_name, options):
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
407 BaseHandler.__init__(self, extractor, contents, content_name, options)
16
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
408 self.target = self.extractor.basename()
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
409
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
410
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
411 class MatchHandler(BaseHandler):
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
412 def can_handle(contents, options):
20
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
413 return ((contents == MATCHING_DIRECTORY) or
22
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
414 ((contents == ONE_ENTRY) and
20
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
415 (options.onedir_policy in (EXTRACT_RENAME, EXTRACT_HERE))))
16
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
416 can_handle = staticmethod(can_handle)
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
417
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
418 def extract(self):
20
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
419 if self.contents == MATCHING_DIRECTORY:
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
420 basename = destination = self.extractor.basename()
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
421 elif self.options.onedir_policy == EXTRACT_HERE:
22
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
422 basename = destination = self.content_name.rstrip('/')
20
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
423 else:
22
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
424 basename = self.content_name.rstrip('/')
20
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
425 destination = self.extractor.basename()
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
426 self.target = tempdir = tempfile.mkdtemp(dir='.')
16
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
427 result = BaseHandler.extract(self)
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
428 if result is None:
20
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
429 checker = self.extractor.name_checker(destination)
16
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
430 self.target = checker.check()
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
431 os.rename(os.path.join(tempdir, basename), self.target)
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
432 os.rmdir(tempdir)
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
433 return result
8
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
434
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
435
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
436 class EmptyHandler(object):
16
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
437 def can_handle(contents, options):
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
438 return contents == EMPTY
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
439 can_handle = staticmethod(can_handle)
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
440
22
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
441 def __init__(self, extractor, contents, content_name, options): pass
8
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
442 def extract(self): pass
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
443 def cleanup(self): pass
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
444
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
445
16
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
446 class BombHandler(BaseHandler):
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
447 def can_handle(contents, options):
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
448 return True
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
449 can_handle = staticmethod(can_handle)
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
450
22
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
451 def __init__(self, extractor, contents, content_name, options):
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
452 BaseHandler.__init__(self, extractor, contents, content_name, options)
16
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
453 checker = self.extractor.name_checker(self.extractor.basename())
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
454 self.target = checker.check()
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
455
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
456
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
457 extractor_map = {'application/x-tar': TarExtractor,
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
458 'application/zip': ZipExtractor,
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
459 'application/x-msdos-program': ZipExtractor,
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
460 'application/x-debian-package': DebExtractor,
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
461 'application/x-redhat-package-manager': RPMExtractor,
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
462 'application/x-rpm': RPMExtractor,
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
463 'application/x-cpio': CpioExtractor}
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
464
16
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
465 handlers = [FlatHandler, OverwriteHandler, MatchHandler, EmptyHandler,
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
466 BombHandler]
8
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
467
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
468 class ExtractorApplication(object):
20
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
469 policy_answers = {'h': EXTRACT_HERE, 'i': EXTRACT_WRAP,
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
470 'r': EXTRACT_RENAME, '': EXTRACT_WRAP}
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
471
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
472 def __init__(self, arguments):
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
473 self.parse_options(arguments)
12
5d202467c589 [svn] Introduce a real logging system. Right now all this really gets us is the
brett
parents: 11
diff changeset
474 self.setup_logger()
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
475 self.successes = []
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
476 self.failures = []
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
477
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
478 def parse_options(self, arguments):
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
479 parser = optparse.OptionParser(
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
480 usage="%prog [options] archive [archive2 ...]",
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
481 description="Intelligent archive extractor",
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
482 version=VERSION_BANNER
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
483 )
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
484 parser.add_option('-r', '--recursive', dest='recursive',
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
485 action='store_true', default=False,
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
486 help='extract archives contained in the ones listed')
13
0a3ef1b9f6d4 [svn] Add options to tweak the logging level to taste.
brett
parents: 12
diff changeset
487 parser.add_option('-q', '--quiet', dest='quiet',
0a3ef1b9f6d4 [svn] Add options to tweak the logging level to taste.
brett
parents: 12
diff changeset
488 action='count', default=3,
0a3ef1b9f6d4 [svn] Add options to tweak the logging level to taste.
brett
parents: 12
diff changeset
489 help='suppress warning/error messages')
0a3ef1b9f6d4 [svn] Add options to tweak the logging level to taste.
brett
parents: 12
diff changeset
490 parser.add_option('-v', '--verbose', dest='verbose',
0a3ef1b9f6d4 [svn] Add options to tweak the logging level to taste.
brett
parents: 12
diff changeset
491 action='count', default=0,
0a3ef1b9f6d4 [svn] Add options to tweak the logging level to taste.
brett
parents: 12
diff changeset
492 help='be verbose/print debugging information')
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
493 parser.add_option('-o', '--overwrite', dest='overwrite',
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
494 action='store_true', default=False,
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
495 help='overwrite any existing target directory')
15
28dbd52a8bb8 [svn] Add a -f/--flat option, which will extract the archive contents into the
brett
parents: 14
diff changeset
496 parser.add_option('-f', '--flat', '--no-directory', dest='flat',
28dbd52a8bb8 [svn] Add a -f/--flat option, which will extract the archive contents into the
brett
parents: 14
diff changeset
497 action='store_true', default=False,
28dbd52a8bb8 [svn] Add a -f/--flat option, which will extract the archive contents into the
brett
parents: 14
diff changeset
498 help="don't put contents in their own directory")
19
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
499 parser.add_option('-l', '-t', '--list', '--table', dest='show_list',
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
500 action='store_true', default=False,
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
501 help="list contents of archives on standard output")
20
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
502 parser.add_option('-n', '--noninteractive', dest='batch',
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
503 action='store_true', default=False,
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
504 help="don't ask how to handle special cases")
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
505 self.options, filenames = parser.parse_args(arguments)
20
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
506 self.options.onedir_policy = self.policy_answers['']
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
507 if not filenames:
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
508 parser.error("you did not list any archives")
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
509 self.archives = {os.path.realpath(os.curdir): filenames}
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
510
12
5d202467c589 [svn] Introduce a real logging system. Right now all this really gets us is the
brett
parents: 11
diff changeset
511 def setup_logger(self):
19
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
512 self.logger = logging.getLogger('dtrx-log')
12
5d202467c589 [svn] Introduce a real logging system. Right now all this really gets us is the
brett
parents: 11
diff changeset
513 handler = logging.StreamHandler()
13
0a3ef1b9f6d4 [svn] Add options to tweak the logging level to taste.
brett
parents: 12
diff changeset
514 # WARNING is the default.
0a3ef1b9f6d4 [svn] Add options to tweak the logging level to taste.
brett
parents: 12
diff changeset
515 handler.setLevel(10 * (self.options.quiet - self.options.verbose))
19
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
516 formatter = logging.Formatter("dtrx: %(levelname)s: %(message)s")
12
5d202467c589 [svn] Introduce a real logging system. Right now all this really gets us is the
brett
parents: 11
diff changeset
517 handler.setFormatter(formatter)
5d202467c589 [svn] Introduce a real logging system. Right now all this really gets us is the
brett
parents: 11
diff changeset
518 self.logger.addHandler(handler)
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
519
20
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
520 def ask_question(self, question, answers):
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
521 if self.options.batch:
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
522 return answers['']
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
523 last_line = question.pop()
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
524 while True:
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
525 print "\n".join(question)
22
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
526 try:
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
527 answer = raw_input(last_line)
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
528 except EOFError:
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
529 return answers['']
20
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
530 try:
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
531 return answers[answer.lower()]
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
532 except KeyError:
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
533 print
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
534
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
535 def get_extractor(self):
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
536 mimetype, encoding = mimetypes.guess_type(self.current_filename)
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
537 try:
8
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
538 extractor = extractor_map[mimetype]
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
539 except KeyError:
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
540 if encoding:
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
541 extractor = CompressionExtractor
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
542 else:
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
543 return "not a known archive type"
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
544 try:
8
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
545 self.current_extractor = extractor(self.current_filename, mimetype,
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
546 encoding)
20
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
547 except ExtractorError, error:
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
548 return str(error)
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
549
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
550 def get_handler(self):
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
551 try:
22
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
552 content, content_name = self.current_extractor.check_contents()
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
553 if content == ONE_ENTRY:
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
554 question = textwrap.wrap("%s contains one entry: %s." %
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
555 (self.current_filename, content_name))
20
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
556 question.extend(["You can:",
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
557 " * extract it Inside another directory",
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
558 " * extract it and Rename the directory",
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
559 " * extract it Here",
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
560 "What do you want to do? (I/r/h) "])
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
561 self.options.onedir_policy = \
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
562 self.ask_question(question, self.policy_answers)
16
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
563 for handler in handlers:
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
564 if handler.can_handle(content, self.options):
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
565 self.current_handler = handler(self.current_extractor,
22
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
566 content, content_name,
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
567 self.options)
16
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
568 break
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
569 except ExtractorError, error:
8
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
570 return str(error)
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
571
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
572 def recurse(self):
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
573 if not self.options.recursive:
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
574 return
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
575 for filename in self.current_extractor.included_archives:
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
576 tail_path, basename = os.path.split(filename)
10
f0acfe12a0e2 [svn] Add tests for the case where we do recursive extraction of an archive
brett
parents: 9
diff changeset
577 directory = os.path.join(self.current_directory,
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
578 self.current_handler.target, tail_path)
8
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
579 self.archives.setdefault(directory, []).append(basename)
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
580
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
581 def report(self, function, *args):
17
481a2b4be471 [svn] Lots of tests for various boundary cases, and slightly better handling for
brett
parents: 16
diff changeset
582 try:
481a2b4be471 [svn] Lots of tests for various boundary cases, and slightly better handling for
brett
parents: 16
diff changeset
583 error = function(*args)
481a2b4be471 [svn] Lots of tests for various boundary cases, and slightly better handling for
brett
parents: 16
diff changeset
584 except (ExtractorError, IOError, OSError), exception:
481a2b4be471 [svn] Lots of tests for various boundary cases, and slightly better handling for
brett
parents: 16
diff changeset
585 error = str(exception)
8
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
586 if error:
12
5d202467c589 [svn] Introduce a real logging system. Right now all this really gets us is the
brett
parents: 11
diff changeset
587 self.logger.error("%s: %s", self.current_filename, error)
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
588 return False
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
589 return True
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
590
19
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
591 def record_status(self, success):
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
592 if success:
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
593 self.successes.append(self.current_filename)
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
594 else:
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
595 self.failures.append(self.current_filename)
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
596
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
597 def extract(self):
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
598 while self.archives:
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
599 self.current_directory, filenames = self.archives.popitem()
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
600 for filename in filenames:
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
601 os.chdir(self.current_directory)
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
602 self.current_filename = filename
20
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
603 success = (self.report(self.get_extractor) and
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
604 self.report(self.get_handler))
8
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
605 if success:
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
606 for name in 'extract', 'cleanup':
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
607 success = (self.report(getattr(self.current_handler,
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
608 name)) and success)
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
609 self.recurse()
19
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
610 self.record_status(success)
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
611
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
612 def show_contents(self):
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
613 for filename in self.current_extractor.get_filenames():
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
614 print filename
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
615
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
616 def show_list(self):
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
617 filenames = self.archives.values()[0]
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
618 if len(filenames) > 1:
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
619 header = "%s:\n"
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
620 else:
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
621 header = None
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
622 for filename in filenames:
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
623 if header:
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
624 print header % (filename,),
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
625 header = "\n%s:\n"
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
626 self.current_filename = filename
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
627 success = (self.report(self.get_extractor) and
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
628 self.report(self.show_contents))
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
629 self.record_status(success)
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
630
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
631 def run(self):
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
632 if self.options.show_list:
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
633 self.show_list()
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
634 else:
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
635 self.extract()
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
636 if self.failures:
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
637 return 1
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
638 return 0
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
639
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
640
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
641 if __name__ == '__main__':
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
642 app = ExtractorApplication(sys.argv[1:])
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
643 sys.exit(app.run())

mercurial