scripts/x

Mon, 13 Nov 2006 23:06:30 -0500

author
brett
date
Mon, 13 Nov 2006 23:06:30 -0500
branch
trunk
changeset 8
97388f5ff770
parent 7
1f3cb3845dfd
child 9
920417b8acc9
permissions
-rwxr-xr-x

[svn] Make ExtractorApplication suck less. Now the strategies for handling
different archive types are out in their own classes, and polymorphism
takes care of everything for us. This is way cleaner.

While I was at it I changed the behavior in the case where an archive
contains one directory that doesn't match the basename. I now treat that
the same as a bomb. This can lead to silly directory structures but
ensures that there's no "data" loss nor unexpected results.

1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
1 #!/usr/bin/env python
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
2 #
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
3 # x -- Intelligently extract various archive types.
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
4 # Copyright (c) 2006 Brett Smith <brettcsmith@brettcsmith.org>.
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
5 #
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
6 # This program is free software; you can redistribute it and/or modify it
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
7 # under the terms of the GNU General Public License as published by the
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
8 # Free Software Foundation; either version 2 of the License, or (at your
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
9 # option) any later version.
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
10 #
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
11 # This program is distributed in the hope that it will be useful, but
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
12 # WITHOUT ANY WARRANTY; without even the implied warranty of
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
13 # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
14 # Public License for more details.
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
15 #
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
16 # You should have received a copy of the GNU General Public License along
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
17 # with this program; if not, write to the Free Software Foundation, Inc.,
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
18 # 51 Franklin Street, 5th Floor, Boston, MA, 02111.
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
19
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
20 import errno
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
21 import mimetypes
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
22 import optparse
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
23 import os
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
24 import subprocess
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
25 import sys
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
26 import tempfile
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
27
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
28 from cStringIO import StringIO
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
29
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
30 VERSION = "1.1"
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
31 VERSION_BANNER = """x version %s
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
32 Copyright (c) 2006 Brett Smith <brettcsmith@brettcsmith.org>
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
33
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
34 This program is free software; you can redistribute it and/or modify it
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
35 under the terms of the GNU General Public License as published by the
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
36 Free Software Foundation; either version 2 of the License, or (at your
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
37 option) any later version.
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
38
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
39 This program is distributed in the hope that it will be useful, but
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
40 WITHOUT ANY WARRANTY; without even the implied warranty of
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
41 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
42 Public License for more details.""" % (VERSION,)
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
43
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
44 MATCHING_DIRECTORY = 1
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
45 # ONE_DIRECTORY = 2
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
46 BOMB = 3
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
47 EMPTY = 4
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
48
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
49 mimetypes.encodings_map.setdefault('.bz2', 'bzip2')
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
50 mimetypes.types_map['.exe'] = 'application/x-msdos-program'
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
51
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
52 class ExtractorError(Exception):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
53 pass
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
54
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
55
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
56 class ProcessStreamer(object):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
57 def __init__(self, command, stdin, description="checking contents",
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
58 stderr=None):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
59 self.process = subprocess.Popen(command, bufsize=1, stdin=stdin,
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
60 stdout=subprocess.PIPE, stderr=stderr)
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
61 self.command = ' '.join(command)
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
62 self.description = description
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
63
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
64 def __iter__(self):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
65 return self
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
66
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
67 def next(self):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
68 line = self.process.stdout.readline()
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
69 if line:
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
70 return line.rstrip('\n')
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
71 else:
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
72 raise StopIteration
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
73
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
74 def stop(self):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
75 while self.process.stdout.readline():
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
76 pass
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
77 self.process.stdout.close()
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
78 status = self.process.wait()
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
79 if status != 0:
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
80 raise ExtractorError("%s error: '%s' returned status code %s" %
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
81 (self.description, self.command, status))
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
82 try:
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
83 self.process.stderr.close()
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
84 except AttributeError:
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
85 pass
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
86
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
87
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
88 class BaseExtractor(object):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
89 decoders = {'bzip2': 'bzcat', 'gzip': 'zcat', 'compress': 'zcat'}
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
90
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
91 def __init__(self, filename, mimetype, encoding):
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
92 if encoding and (not self.decoders.has_key(encoding)):
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
93 raise ValueError("unrecognized encoding %s" % (encoding,))
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
94 self.filename = filename
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
95 self.mimetype = mimetype
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
96 self.encoding = encoding
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
97 self.included_archives = []
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
98 try:
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
99 self.archive = open(filename, 'r')
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
100 except (IOError, OSError), error:
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
101 raise ExtractorError("could not open %s: %s" %
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
102 (filename, error.strerror))
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
103 if encoding:
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
104 self.pipe([self.decoders[encoding]], "decoding")
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
105 self.prepare()
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
106
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
107 def run(self, command, description="extraction", stdout=None, stderr=None,
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
108 stdin=None):
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
109 process = subprocess.Popen(command, stdin=stdin, stdout=stdout,
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
110 stderr=stderr)
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
111 status = process.wait()
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
112 if status != 0:
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
113 raise ExtractorError("%s error: '%s' returned status code %s" %
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
114 (description, ' '.join(command), status))
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
115 for pipe in (process.stdout, process.stderr):
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
116 try:
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
117 pipe.close()
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
118 except AttributeError:
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
119 pass
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
120
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
121 def pipe(self, command, description, stderr=None):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
122 output = tempfile.TemporaryFile()
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
123 self.run(command, description, output, stderr, self.archive)
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
124 self.archive.close()
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
125 self.archive = output
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
126 self.archive.flush()
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
127
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
128 def prepare(self):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
129 pass
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
130
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
131 def check_contents(self):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
132 self.archive.seek(0, 0)
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
133 archive_type = None
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
134 filenames = self.get_filenames()
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
135 try:
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
136 filename = filenames.next()
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
137 if extractor_map.has_key(mimetypes.guess_type(filename)[0]):
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
138 self.included_archives.append(filename)
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
139 first_part = filename.split('/', 1)[0] + '/'
2
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
140 except StopIteration:
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
141 filenames.stop()
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
142 return EMPTY
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
143 for filename in filenames:
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
144 if extractor_map.has_key(mimetypes.guess_type(filename)[0]):
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
145 self.included_archives.append(filename)
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
146 if (archive_type is None) and (not filename.startswith(first_part)):
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
147 archive_type = BOMB
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
148 filenames.stop()
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
149 if archive_type:
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
150 return archive_type
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
151 if self.basename() == first_part[:-1]:
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
152 return MATCHING_DIRECTORY
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
153 return first_part
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
154
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
155 def basename(self):
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
156 pieces = os.path.basename(self.filename).split('.')
2
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
157 extension = '.' + pieces[-1]
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
158 if mimetypes.encodings_map.has_key(extension):
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
159 pieces.pop()
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
160 extension = '.' + pieces[-1]
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
161 if (mimetypes.types_map.has_key(extension) or
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
162 mimetypes.common_types.has_key(extension) or
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
163 mimetypes.suffix_map.has_key(extension)):
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
164 pieces.pop()
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
165 return '.'.join(pieces)
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
166
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
167 def extract(self, path):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
168 self.archive.seek(0, 0)
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
169 self.extract_archive()
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
170
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
171
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
172 class TarExtractor(BaseExtractor):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
173 def get_filenames(self):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
174 return ProcessStreamer(['tar', '-t'], self.archive)
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
175
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
176 def extract_archive(self):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
177 self.run(['tar', '-x'], stdin=self.archive)
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
178
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
179
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
180 class ZipExtractor(BaseExtractor):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
181 def __init__(self, filename, mimetype, encoding):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
182 self.filename = filename
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
183 self.mimetype = mimetype
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
184 self.encoding = encoding
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
185 self.included_archives = []
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
186 self.archive = StringIO()
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
187
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
188 def get_filenames(self):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
189 return ProcessStreamer(['zipinfo', '-1', self.filename], None)
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
190
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
191 def extract(self, path):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
192 self.run(['unzip', '-q', os.path.join(path, self.filename)])
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
193
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
194
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
195 class CpioExtractor(BaseExtractor):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
196 def get_filenames(self):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
197 return ProcessStreamer(['cpio', '-t'], self.archive,
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
198 stderr=subprocess.PIPE)
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
199
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
200 def extract_archive(self):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
201 self.run(['cpio', '-i', '--make-directories',
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
202 '--no-absolute-filenames'],
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
203 stderr=subprocess.PIPE, stdin=self.archive)
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
204
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
205
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
206 class RPMExtractor(CpioExtractor):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
207 def prepare(self):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
208 self.pipe(['rpm2cpio', '-'], "rpm2cpio")
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
209
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
210 def basename(self):
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
211 pieces = os.path.basename(self.filename).rsplit('.', 2)
2
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
212 if len(pieces) == 1:
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
213 return pieces[0]
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
214 elif pieces[-1] != 'rpm':
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
215 return BaseExtractor.basename(self)
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
216 pieces.pop()
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
217 if len(pieces) == 1:
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
218 return pieces[0]
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
219 elif len(pieces[-1]) < 6:
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
220 pieces.pop()
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
221 return '.'.join(pieces)
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
222
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
223 def check_contents(self):
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
224 CpioExtractor.check_contents(self)
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
225 return BOMB
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
226
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
227
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
228 class DebExtractor(TarExtractor):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
229 def prepare(self):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
230 self.pipe(['ar', 'p', self.filename, 'data.tar.gz'],
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
231 "data.tar.gz extraction")
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
232 self.archive.seek(0, 0)
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
233 self.pipe(['zcat'], "data.tar.gz decompression")
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
234
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
235 def basename(self):
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
236 pieces = os.path.basename(self.filename).rsplit('_', 1)
2
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
237 if len(pieces) == 1:
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
238 return pieces[0]
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
239 elif (len(pieces[-1]) > 10) or (not pieces[-1].endswith('.deb')):
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
240 return BaseExtractor.basename(self)
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
241 return pieces[0]
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
242
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
243 def check_contents(self):
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
244 TarExtractor.check_contents(self)
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
245 return BOMB
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
246
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
247
8
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
248 class MatchHandler(object):
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
249 def __init__(self, extractor, contents):
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
250 self.extractor = extractor
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
251 self.contents = contents
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
252 self.directory = extractor.basename()
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
253
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
254 def extract(self, directory='.'):
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
255 try:
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
256 self.extractor.extract(directory)
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
257 except ExtractorError, error:
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
258 return error.strerror
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
259
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
260 def cleanup(self):
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
261 command = 'chmod'
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
262 status = subprocess.call(['chmod', '-R', 'u+rw', self.directory])
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
263 if status == 0:
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
264 command = 'find'
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
265 status = subprocess.call(['find', self.directory, '-type', 'd',
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
266 '-exec', 'chmod', 'u+x', '{}', ';'])
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
267 if status != 0:
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
268 return "%s returned with exit status %s" % (command, status)
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
269
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
270
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
271 class BombHandler(MatchHandler):
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
272 def __init__(self, extractor, contents):
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
273 MatchHandler.__init__(self, extractor, contents)
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
274 basename = self.directory
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
275 for suffix in [''] + ['.%s' % (x,) for x in range(1, 10)]:
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
276 self.directory = '%s%s' % (basename, suffix)
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
277 try:
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
278 os.mkdir(self.directory)
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
279 except OSError, error:
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
280 if error.errno == errno.EEXIST:
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
281 continue
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
282 raise ValueError("could not make extraction directory %s: %s" %
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
283 (error.filename, error.strerror))
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
284 ## if suffix != '':
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
285 ## self.show_error("extracted to %s" % (directory,))
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
286 break
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
287 else:
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
288 raise ValueError("all good names for an extraction directory taken")
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
289
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
290 def extract(self):
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
291 os.chdir(self.directory)
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
292 return MatchHandler.extract(self, '..')
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
293
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
294 def cleanup(self):
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
295 os.chdir('..')
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
296 return MatchHandler.cleanup(self)
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
297
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
298
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
299 class EmptyHandler(object):
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
300 def __init__(self, extractor, contents): pass
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
301 def extract(self): pass
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
302 def cleanup(self): pass
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
303
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
304 extractor_map = {'application/x-tar': TarExtractor,
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
305 'application/zip': ZipExtractor,
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
306 'application/x-msdos-program': ZipExtractor,
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
307 'application/x-debian-package': DebExtractor,
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
308 'application/x-redhat-package-manager': RPMExtractor,
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
309 'application/x-rpm': RPMExtractor,
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
310 'application/x-cpio': CpioExtractor}
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
311
8
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
312 handler_map = {EMPTY: EmptyHandler,
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
313 MATCHING_DIRECTORY: MatchHandler}
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
314
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
315 class ExtractorApplication(object):
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
316 def __init__(self, arguments):
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
317 self.parse_options(arguments)
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
318 self.successes = []
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
319 self.failures = []
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
320
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
321 def parse_options(self, arguments):
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
322 parser = optparse.OptionParser(
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
323 usage="%prog [options] archive [archive2 ...]",
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
324 description="Intelligent archive extractor",
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
325 version=VERSION_BANNER
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
326 )
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
327 parser.add_option('-r', '--recursive', dest='recursive',
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
328 action='store_true', default=False,
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
329 help='extract archives contained in the ones listed')
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
330 self.options, filenames = parser.parse_args(arguments)
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
331 if not filenames:
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
332 parser.error("you did not list any archives")
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
333 self.archives = {os.path.realpath(os.curdir): filenames}
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
334
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
335 def show_error(self, message):
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
336 print >>sys.stderr, "%s: %s" % (self.current_filename, message)
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
337
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
338 def get_extractor(self):
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
339 mimetype, encoding = mimetypes.guess_type(self.current_filename)
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
340 try:
8
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
341 extractor = extractor_map[mimetype]
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
342 except KeyError:
8
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
343 return "not a known archive type"
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
344 try:
8
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
345 self.current_extractor = extractor(self.current_filename, mimetype,
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
346 encoding)
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
347 content = self.current_extractor.check_contents()
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
348 handler = handler_map.get(content, BombHandler)
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
349 self.current_handler = handler(self.current_extractor, content)
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
350 except ExtractorError, error:
8
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
351 return str(error)
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
352
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
353 def recurse(self):
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
354 if not self.options.recursive:
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
355 return
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
356 archive_path = os.path.split(self.current_filename)[0]
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
357 for filename in self.current_extractor.included_archives:
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
358 tail_path, basename = os.path.split(filename)
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
359 directory = os.path.join(self.current_directory, archive_path,
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
360 self.current_handler.directory, tail_path)
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
361 self.archives.setdefault(directory, []).append(basename)
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
362
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
363 def report(self, function, *args):
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
364 error = function(*args)
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
365 if error:
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
366 self.show_error(error)
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
367 return False
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
368 return True
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
369
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
370 def run(self):
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
371 while self.archives:
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
372 self.current_directory, filenames = self.archives.popitem()
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
373 for filename in filenames:
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
374 os.chdir(self.current_directory)
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
375 self.current_filename = filename
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
376 self.cleanup_actions = []
8
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
377 success = self.report(self.get_extractor)
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
378 if success:
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
379 for name in 'extract', 'cleanup':
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
380 success = (self.report(getattr(self.current_handler,
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
381 name)) and success)
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
382 self.recurse()
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
383 if success:
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
384 self.successes.append(self.current_filename)
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
385 else:
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
386 self.failures.append(self.current_filename)
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
387 if self.failures:
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
388 return 1
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
389 return 0
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
390
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
391
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
392 if __name__ == '__main__':
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
393 app = ExtractorApplication(sys.argv[1:])
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
394 sys.exit(app.run())

mercurial