scripts/dtrx

Wed, 06 Feb 2008 19:38:26 -0500

author
brett
date
Wed, 06 Feb 2008 19:38:26 -0500
branch
trunk
changeset 59
7a0aafe2fe87
parent 58
16506464d57b
child 62
17d845dacff5
permissions
-rwxr-xr-x

[svn] Find self-extracting archives by their file magic only, not extension/mimetype.

The problem with using extensions/mimetypes for this is that it nets way too
many false positives. Files ending in .com, .bat, etc. would cause the
user to be prompted for recursive extraction. This makes that problem go
away, and it means that error messages when the user tries to extract a
non-archive .exe will probably be more useful, too.
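
As a rough illustration of the idea in this commit message, and not the code dtrx itself uses, the sketch below decides whether a file might be a self-extracting archive by reading its leading bytes instead of trusting its name. It assumes that such archives are DOS/Windows executables, which start with the two-byte "MZ" signature. The names EXECUTABLE_MAGIC, looks_like_self_extractor and demo are hypothetical.

```python
import os
import tempfile

# Assumption: self-extracting archives are treated here as DOS/Windows
# executables, which begin with the two-byte "MZ" signature.
EXECUTABLE_MAGIC = b"MZ"


def looks_like_self_extractor(path):
    """Return True if the file's leading bytes match the executable magic.

    The filename and its extension are deliberately ignored.
    """
    with open(path, "rb") as handle:
        return handle.read(len(EXECUTABLE_MAGIC)) == EXECUTABLE_MAGIC


def demo():
    """Build two throwaway files and report what the magic check says."""
    workdir = tempfile.mkdtemp()
    # A batch script whose name an extension-based check might flag.
    script = os.path.join(workdir, "install.bat")
    with open(script, "wb") as handle:
        handle.write(b"@echo off\r\n")
    # A file with no telling extension whose contents start with "MZ".
    binary = os.path.join(workdir, "setup")
    with open(binary, "wb") as handle:
        handle.write(b"MZ\x90\x00" + b"\x00" * 60)
    return looks_like_self_extractor(script), looks_like_self_extractor(binary)
```

A matching signature only says the file is an executable. Whether it actually carries an archive still has to be settled by trying an extractor on it, which is where the more useful error for a non-archive .exe would come from.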

1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
1 #!/usr/bin/env python
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
2 #
19
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
3 # dtrx -- Intelligently extract various archive types.
54
cd43d2f61162 [svn] Update copyright dates in the license headers.
brett
parents: 53
diff changeset
4 # Copyright (c) 2006, 2007, 2008 Brett Smith <brettcsmith@brettcsmith.org>.
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
5 #
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
6 # This program is free software; you can redistribute it and/or modify it
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
7 # under the terms of the GNU General Public License as published by the
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
8 # Free Software Foundation; either version 3 of the License, or (at your
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
9 # option) any later version.
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
10 #
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
11 # This program is distributed in the hope that it will be useful, but
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
12 # WITHOUT ANY WARRANTY; without even the implied warranty of
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
13 # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
14 # Public License for more details.
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
15 #
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
16 # You should have received a copy of the GNU General Public License along
42
4a4cab75d5e6 [svn] Update documentation.
brett
parents: 41
diff changeset
17 # with this program; if not, see <http://www.gnu.org/licenses/>.
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
18
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
19 import errno
48
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
20 import glob
12
5d202467c589 [svn] Introduce a real logging system. Right now all this really gets us is the
brett
parents: 11
diff changeset
21 import logging
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
22 import mimetypes
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
23 import optparse
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
24 import os
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
25 import re
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
26 import shutil
48
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
27 import signal
15
28dbd52a8bb8 [svn] Add a -f/--flat option, which will extract the archive contents into the
brett
parents: 14
diff changeset
28 import stat
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
29 import subprocess
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
30 import sys
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
31 import tempfile
20
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
32 import textwrap
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
33 import traceback
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
34
48
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
35 from sets import Set as set
36
4bf2508d9b9e [svn] Small optimization to be nice to the system: don't try a given extractor
brett
parents: 35
diff changeset
36
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
37 VERSION = "6.0"
19
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
38 VERSION_BANNER = """dtrx version %s
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
39 Copyright (c) 2006, 2007, 2008 Brett Smith <brettcsmith@brettcsmith.org>
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
40
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
41 This program is free software; you can redistribute it and/or modify it
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
42 under the terms of the GNU General Public License as published by the
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
43 Free Software Foundation; either version 3 of the License, or (at your
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
44 option) any later version.
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
45
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
46 This program is distributed in the hope that it will be useful, but
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
47 WITHOUT ANY WARRANTY; without even the implied warranty of
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
48 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
49 Public License for more details.""" % (VERSION,)
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
50
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
51 MATCHING_DIRECTORY = 1
22
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
52 ONE_ENTRY = 2
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
53 BOMB = 3
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
54 EMPTY = 4
22
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
55 ONE_ENTRY_KNOWN = 5
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
56
20
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
57 EXTRACT_HERE = 1
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
58 EXTRACT_WRAP = 2
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
59 EXTRACT_RENAME = 3
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
60
23
039dd321a7d0 [svn] If an archive contains other archives, and the user didn't specify that
brett
parents: 22
diff changeset
61 RECURSE_ALWAYS = 1
039dd321a7d0 [svn] If an archive contains other archives, and the user didn't specify that
brett
parents: 22
diff changeset
62 RECURSE_ONCE = 2
039dd321a7d0 [svn] If an archive contains other archives, and the user didn't specify that
brett
parents: 22
diff changeset
63 RECURSE_NOT_NOW = 3
039dd321a7d0 [svn] If an archive contains other archives, and the user didn't specify that
brett
parents: 22
diff changeset
64 RECURSE_NEVER = 4
53
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
65 RECURSE_LIST = 5
23
039dd321a7d0 [svn] If an archive contains other archives, and the user didn't specify that
brett
parents: 22
diff changeset
66
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
67 mimetypes.encodings_map.setdefault('.bz2', 'bzip2')
34
a8f875e02c83 [svn] Add support for LZMA compression. Holy crap that was easy.
brett
parents: 33
diff changeset
68 mimetypes.encodings_map.setdefault('.lzma', 'lzma')
41
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
69 mimetypes.types_map.setdefault('.gem', 'application/x-ruby-gem')
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
70
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
71 logger = logging.getLogger('dtrx-log')
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
72
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
73 class FilenameChecker(object):
58
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
diff changeset
74 free_func = os.open
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
diff changeset
75 free_args = (os.O_CREAT | os.O_EXCL,)
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
diff changeset
76 free_close = os.close
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
diff changeset
77
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
78 def __init__(self, original_name):
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
79 self.original_name = original_name
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
80
17
481a2b4be471 [svn] Lots of tests for various boundary cases, and slightly better handling for
brett
parents: 16
diff changeset
81 def is_free(self, filename):
58
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
diff changeset
82 try:
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
diff changeset
83 result = self.free_func(filename, *self.free_args)
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
diff changeset
84 except OSError, error:
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
diff changeset
85 if error.errno == errno.EEXIST:
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
diff changeset
86 return False
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
diff changeset
87 raise
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
diff changeset
88 if self.free_close:
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
diff changeset
89 self.free_close(result)
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
diff changeset
90 return True
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
91
41
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
92 def create(self):
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
93 fd, filename = tempfile.mkstemp(prefix=self.original_name + '.',
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
94 dir='.')
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
95 os.close(fd)
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
96 return filename
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
97
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
98 def check(self):
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
99 for suffix in [''] + ['.%s' % (x,) for x in range(1, 10)]:
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
100 filename = '%s%s' % (self.original_name, suffix)
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
101 if self.is_free(filename):
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
102 return filename
41
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
103 return self.create()
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
104
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
105
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
106 class DirectoryChecker(FilenameChecker):
58
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
diff changeset
107 free_func = os.mkdir
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
diff changeset
108 free_args = ()
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
diff changeset
109 free_close = None
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
110
41
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
111 def create(self):
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
112 return tempfile.mkdtemp(prefix=self.original_name + '.', dir='.')
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
113
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
114
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
115 class ExtractorError(Exception):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
116 pass
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
117
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
118
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
119 class ExtractorUnusable(Exception):
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
120 pass
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
121
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
122
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
123 EXTRACTION_ERRORS = (ExtractorError, ExtractorUnusable, OSError, IOError)
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
124
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
125 class BaseExtractor(object):
34
a8f875e02c83 [svn] Add support for LZMA compression. Holy crap that was easy.
brett
parents: 33
diff changeset
126 decoders = {'bzip2': 'bzcat', 'gzip': 'zcat', 'compress': 'zcat',
a8f875e02c83 [svn] Add support for LZMA compression. Holy crap that was easy.
brett
parents: 33
diff changeset
127 'lzma': 'lzcat'}
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
128
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
129 name_checker = DirectoryChecker
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
130
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
131 def __init__(self, filename, encoding):
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
132 if encoding and (not self.decoders.has_key(encoding)):
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
133 raise ValueError("unrecognized encoding %s" % (encoding,))
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
134 self.filename = os.path.realpath(filename)
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
135 self.encoding = encoding
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
136 self.included_archives = []
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
137 self.target = None
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
138 self.content_type = None
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
139 self.content_name = None
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
140 self.pipes = []
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
141 try:
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
142 self.archive = open(filename, 'r')
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
143 except (IOError, OSError), error:
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
144 raise ExtractorError("could not open %s: %s" %
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
145 (filename, error.strerror))
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
146 if encoding:
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
147 self.pipe([self.decoders[encoding]], "decoding")
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
148 self.prepare()
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
149
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
150 def pipe(self, command, description="extraction"):
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
151 self.pipes.append((command, description))
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
152
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
153 def run_pipes(self, final_stdout=None):
41
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
154 if not self.pipes:
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
155 return
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
156 elif final_stdout is None:
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
157 # FIXME: Buffering this might be dumb.
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
158 final_stdout = tempfile.TemporaryFile()
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
159 num_pipes = len(self.pipes)
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
160 last_pipe = num_pipes - 1
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
161 processes = []
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
162 for index, command in enumerate([pipe[0] for pipe in self.pipes]):
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
163 if index == 0:
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
164 stdin = self.archive
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
165 else:
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
166 stdin = processes[-1].stdout
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
167 if index == last_pipe:
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
168 stdout = final_stdout
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
169 else:
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
170 stdout = subprocess.PIPE
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
171 try:
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
172 processes.append(subprocess.Popen(command, stdin=stdin,
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
173 stdout=stdout,
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
174 stderr=subprocess.PIPE))
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
175 except OSError, error:
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
176 if error.errno == errno.ENOENT:
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
177 raise ExtractorUnusable("could not run %s" % (command[0],))
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
178 raise
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
179 exit_codes = [pipe.wait() for pipe in processes]
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
180 self.archive.close()
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
181 for index in range(last_pipe):
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
182 processes[index].stdout.close()
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
183 processes[index].stderr.close()
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
184 for index, status in enumerate(exit_codes):
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
185 if status != 0:
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
186 raise ExtractorError("%s error: '%s' returned status code %s" %
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
187 (self.pipes[index][1],
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
188 ' '.join(self.pipes[index][0]), status))
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
189 self.archive = final_stdout
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
190
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
191 def prepare(self):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
192 pass
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
193
47
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
194 def check_included_archives(self):
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
195 if (self.content_name is None) or (not self.content_name.endswith('/')):
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
196 self.included_root = './'
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
197 else:
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
198 self.included_root = self.content_name
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
199 start_index = len(self.included_root)
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
200 for path, dirname, filenames in os.walk(self.included_root):
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
201 path = path[start_index:]
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
202 for filename in filenames:
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
203 if (ExtractorBuilder.try_by_mimetype(filename) or
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
204 ExtractorBuilder.try_by_extension(filename)):
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
205 self.included_archives.append(os.path.join(path, filename))
22
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
206
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
207 def check_contents(self):
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
208 self.contents = os.listdir('.')
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
209 if not self.contents:
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
210 self.content_type = EMPTY
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
211 elif len(self.contents) == 1:
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
212 if self.basename() == self.contents[0]:
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
213 self.content_type = MATCHING_DIRECTORY
22
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
214 else:
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
215 self.content_type = ONE_ENTRY
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
216 self.content_name = self.contents[0]
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
217 if os.path.isdir(self.contents[0]):
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
218 self.content_name += '/'
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
219 else:
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
220 self.content_type = BOMB
47
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
221 self.check_included_archives()
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
222
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
223 def basename(self):
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
224 pieces = os.path.basename(self.filename).split('.')
2
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
225 extension = '.' + pieces[-1]
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
226 if mimetypes.encodings_map.has_key(extension):
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
227 pieces.pop()
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
228 extension = '.' + pieces[-1]
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
229 if (mimetypes.types_map.has_key(extension) or
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
230 mimetypes.common_types.has_key(extension) or
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
231 mimetypes.suffix_map.has_key(extension)):
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
232 pieces.pop()
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
233 return '.'.join(pieces)
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
234
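The `basename` method above can be read as a two-step suffix strip: first a compression suffix, then one recognized type suffix. A minimal sketch of that logic as a standalone function follows. It is written for Python 3 (so `in` replaces `has_key`), and the function name `strip_known_extensions` is hypothetical, not part of dtrx.

```python
import mimetypes
import os


def strip_known_extensions(filename):
    """Guess a directory-friendly base name, mirroring the logic above."""
    pieces = os.path.basename(filename).split('.')
    extension = '.' + pieces[-1]
    # First drop a compression suffix such as .gz or .bz2, if present.
    if extension in mimetypes.encodings_map:
        pieces.pop()
        extension = '.' + pieces[-1]
    # Then drop one recognized type suffix such as .tar or .tgz.
    if (extension in mimetypes.types_map or
            extension in mimetypes.common_types or
            extension in mimetypes.suffix_map):
        pieces.pop()
    return '.'.join(pieces)


tarball_name = strip_known_extensions('/tmp/project-1.2.tar.gz')
shorthand_name = strip_known_extensions('backup.tgz')
plain_name = strip_known_extensions('README')
```

Because only suffixes known to the `mimetypes` tables are removed, a version component such as `1.2` is left in place.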
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
235 def extract(self):
40
ee6a869f8da1 [svn] Be a little nicer about explaining that we can't extract to the current
brett
parents: 39
diff changeset
236 try:
ee6a869f8da1 [svn] Be a little nicer about explaining that we can't extract to the current
brett
parents: 39
diff changeset
237 self.target = tempfile.mkdtemp(prefix='.dtrx-', dir='.')
ee6a869f8da1 [svn] Be a little nicer about explaining that we can't extract to the current
brett
parents: 39
diff changeset
238 except (OSError, IOError), error:
ee6a869f8da1 [svn] Be a little nicer about explaining that we can't extract to the current
brett
parents: 39
diff changeset
239 raise ExtractorError("cannot extract here: %s" % (error.strerror,))
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
240 old_path = os.path.realpath(os.curdir)
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
241 os.chdir(self.target)
33
3547e3124729 [svn] Fix some bugs and make things a little more user-friendly now that we can
brett
parents: 32
diff changeset
242 try:
3547e3124729 [svn] Fix some bugs and make things a little more user-friendly now that we can
brett
parents: 32
diff changeset
243 self.archive.seek(0, 0)
3547e3124729 [svn] Fix some bugs and make things a little more user-friendly now that we can
brett
parents: 32
diff changeset
244 self.extract_archive()
3547e3124729 [svn] Fix some bugs and make things a little more user-friendly now that we can
brett
parents: 32
diff changeset
245 self.check_contents()
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
246 except EXTRACTION_ERRORS:
48
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
247 self.archive.close()
33
3547e3124729 [svn] Fix some bugs and make things a little more user-friendly now that we can
brett
parents: 32
diff changeset
248 os.chdir(old_path)
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
249 shutil.rmtree(self.target, ignore_errors=True)
33
3547e3124729 [svn] Fix some bugs and make things a little more user-friendly now that we can
brett
parents: 32
diff changeset
250 raise
48
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
251 self.archive.close()
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
252 os.chdir(old_path)
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
253
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
254 def get_filenames(self):
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
255 self.run_pipes()
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
256 self.archive.seek(0, 0)
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
257 while True:
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
258 line = self.archive.readline()
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
259 if not line:
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
260 self.archive.close()
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
261 return
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
262 yield line.rstrip('\n')
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
263
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
264
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
265 class CompressionExtractor(BaseExtractor):
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
266 file_type = 'compressed file'
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
267 name_checker = FilenameChecker
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
268
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
269 def basename(self):
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
270 pieces = os.path.basename(self.filename).split('.')
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
271 extension = '.' + pieces[-1]
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
272 if mimetypes.encodings_map.has_key(extension):
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
273 pieces.pop()
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
274 return '.'.join(pieces)
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
275
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
276 def get_filenames(self):
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
277 yield self.basename()
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
278
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
279 def extract(self):
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
280 self.content_type = ONE_ENTRY_KNOWN
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
281 self.content_name = self.basename()
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
282 self.contents = None
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
283 self.included_root = './'
40
ee6a869f8da1 [svn] Be a little nicer about explaining that we can't extract to the current
brett
parents: 39
diff changeset
284 try:
ee6a869f8da1 [svn] Be a little nicer about explaining that we can't extract to the current
brett
parents: 39
diff changeset
285 output_fd, self.target = tempfile.mkstemp(prefix='.dtrx-', dir='.')
ee6a869f8da1 [svn] Be a little nicer about explaining that we can't extract to the current
brett
parents: 39
diff changeset
286 except (OSError, IOError), error:
ee6a869f8da1 [svn] Be a little nicer about explaining that we can't extract to the current
brett
parents: 39
diff changeset
287 raise ExtractorError("cannot extract here: %s" % (error.strerror,))
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
288 try:
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
289 self.run_pipes(output_fd)
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
290 except EXTRACTION_ERRORS:
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
291 os.close(output_fd)
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
292 os.unlink(self.target)
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
293 raise
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
294 os.close(output_fd)
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
295
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
296
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
297 class TarExtractor(BaseExtractor):
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
298 file_type = 'tar file'
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
299
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
300 def get_filenames(self):
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
301 self.pipe(['tar', '-t'], "listing")
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
302 return BaseExtractor.get_filenames(self)
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
303
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
304 def extract_archive(self):
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
305 self.pipe(['tar', '-x'])
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
306 self.run_pipes()
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
307
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
308
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
309 class CpioExtractor(BaseExtractor):
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
310 file_type = 'cpio file'
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
311
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
312 def get_filenames(self):
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
313 self.pipe(['cpio', '-t'], "listing")
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
314 return BaseExtractor.get_filenames(self)
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
315
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
316 def extract_archive(self):
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
317 self.pipe(['cpio', '-i', '--make-directories',
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
318 '--no-absolute-filenames'])
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
319 self.run_pipes()
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
320
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
321
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
322 class RPMExtractor(CpioExtractor):
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
323 file_type = 'RPM'
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
324
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
325 def prepare(self):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
326 self.pipe(['rpm2cpio', '-'], "rpm2cpio")
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
327
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
328 def basename(self):
9
920417b8acc9 [svn] Fix issues with basename methods. First, string's rsplit method only
brett
parents: 8
diff changeset
329 pieces = os.path.basename(self.filename).split('.')
2
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
330 if len(pieces) == 1:
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
331 return pieces[0]
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
332 elif pieces[-1] != 'rpm':
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
333 return BaseExtractor.basename(self)
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
334 pieces.pop()
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
335 if len(pieces) == 1:
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
336 return pieces[0]
9
920417b8acc9 [svn] Fix issues with basename methods. First, string's rsplit method only
brett
parents: 8
diff changeset
337 elif len(pieces[-1]) < 8:
2
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
338 pieces.pop()
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
339 return '.'.join(pieces)
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
340
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
341 def check_contents(self):
47
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
342 self.check_included_archives()
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
343 self.content_type = BOMB
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
344
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
345
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
346 class DebExtractor(TarExtractor):
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
347 file_type = 'Debian package'
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
348
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
349 def prepare(self):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
350 self.pipe(['ar', 'p', self.filename, 'data.tar.gz'],
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
351 "data.tar.gz extraction")
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
352 self.pipe(['zcat'], "data.tar.gz decompression")
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
353
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
354 def basename(self):
9
920417b8acc9 [svn] Fix issues with basename methods. First, string's rsplit method only
brett
parents: 8
diff changeset
355 pieces = os.path.basename(self.filename).split('_')
2
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
356 if len(pieces) == 1:
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
357 return pieces[0]
9
920417b8acc9 [svn] Fix issues with basename methods. First, string's rsplit method only
brett
parents: 8
diff changeset
358 last_piece = pieces.pop()
920417b8acc9 [svn] Fix issues with basename methods. First, string's rsplit method only
brett
parents: 8
diff changeset
359 if (len(last_piece) > 10) or (not last_piece.endswith('.deb')):
2
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
360 return BaseExtractor.basename(self)
9
920417b8acc9 [svn] Fix issues with basename methods. First, string's rsplit method only
brett
parents: 8
diff changeset
361 return '_'.join(pieces)
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
362
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
363 def check_contents(self):
47
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
364 self.check_included_archives()
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
365 self.content_type = BOMB
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
366
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
367
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
368 class DebMetadataExtractor(DebExtractor):
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
369 def prepare(self):
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
370 self.pipe(['ar', 'p', self.filename, 'control.tar.gz'],
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
371 "control.tar.gz extraction")
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
372 self.pipe(['zcat'], "control.tar.gz decompression")
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
373
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
374
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
375 class GemExtractor(TarExtractor):
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
376 file_type = 'Ruby gem'
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
377
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
378 def prepare(self):
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
379 self.pipe(['tar', '-xO', 'data.tar.gz'], "data.tar.gz extraction")
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
380 self.pipe(['zcat'], "data.tar.gz decompression")
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
381
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
382 def check_contents(self):
47
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
383 self.check_included_archives()
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
384 self.content_type = BOMB
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
385
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
386
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
387 class GemMetadataExtractor(CompressionExtractor):
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
388 file_type = 'Ruby gem'
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
389
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
390 def prepare(self):
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
391 self.pipe(['tar', '-xO', 'metadata.gz'], "metadata.gz extraction")
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
392 self.pipe(['zcat'], "metadata.gz decompression")
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
393
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
394 def basename(self):
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
395 return os.path.basename(self.filename) + '-metadata.txt'
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
396
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
397
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
398 class NoPipeExtractor(BaseExtractor):
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
399 # Some extraction tools won't accept the archive from stdin. With
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
400 # these, the piping infrastructure we normally set up generally doesn't
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
401 # work, at least at first. We can still use most of it; we just don't
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
402 # want to seed self.archive with the archive file, since that sucks up
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
403 # memory. So instead we seed it with /dev/null, and specify the
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
404 # filename on the command line as necessary. We also open the actual
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
405 # file with os.open, to make sure we can actually do it (permissions
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
406 # are good, etc.). This class doesn't do anything by itself; it's just
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
407 # meant to be a base class for extractors that rely on these dumb
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
408 # tools.
32
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
409 def __init__(self, filename, encoding):
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
410 os.close(os.open(filename, os.O_RDONLY))
32
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
411 BaseExtractor.__init__(self, '/dev/null', None)
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
412 self.filename = os.path.realpath(filename)
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
413
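The comment on `NoPipeExtractor` describes checking that the archive is readable, then handing the tool a filename while its standard input stays empty. The following is a rough Python 3 sketch of that idea using `subprocess` directly rather than the script's own pipe machinery. The helper `run_on_named_file` and the `probe` command are hypothetical stand-ins for illustration; the probe takes the place of a real tool such as `unzip`.

```python
import os
import subprocess
import sys
import tempfile


def run_on_named_file(command, filename):
    """Run a tool that wants a filename argument rather than data on stdin."""
    # Fail early, with an ordinary OSError, if the file cannot be read.
    os.close(os.open(filename, os.O_RDONLY))
    # Resolve the path now so it stays valid if the caller changes directory.
    path = os.path.realpath(filename)
    # Give the tool an empty stdin instead of buffering the archive in memory.
    result = subprocess.run(command + [path], stdin=subprocess.DEVNULL,
                            stdout=subprocess.PIPE, check=True)
    return result.stdout.decode()


# Hypothetical stand-in for a real extraction tool: it reports the
# filename argument it received and how many bytes arrived on stdin.
probe = [sys.executable, '-c',
         'import sys; print(sys.argv[1]); print(len(sys.stdin.read()))']

with tempfile.TemporaryDirectory() as tmp:
    archive = os.path.join(tmp, 'example.zip')
    open(archive, 'wb').close()
    expected_path = os.path.realpath(archive)
    output = run_on_named_file(probe, archive).splitlines()
```

Opening and immediately closing the file costs almost nothing, and it turns a permissions problem into a Python exception before any external tool is started.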
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
414
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
415 class ZipExtractor(NoPipeExtractor):
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
416 file_type = 'Zip file'
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
417
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
418 def get_filenames(self):
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
419 self.pipe(['zipinfo', '-1', self.filename], "listing")
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
420 return BaseExtractor.get_filenames(self)
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
421
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
422 def extract_archive(self):
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
423 self.pipe(['unzip', '-q', self.filename])
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
424 self.run_pipes()
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
425
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
426
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
427 class SevenExtractor(NoPipeExtractor):
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
428 file_type = '7z file'
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
429 border_re = re.compile('^[- ]+$')
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
430
32
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
431 def get_filenames(self):
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
432 self.pipe(['7z', 'l', self.filename], "listing")
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
433 self.run_pipes()
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
434 self.archive.seek(0, 0)
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
435 fn_index = None
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
436 for line in self.archive:
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
437 if self.border_re.match(line):
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
438 if fn_index is not None:
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
439 break
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
440 else:
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
441 fn_index = line.rindex(' ') + 1
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
442 elif fn_index is not None:
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
443 yield line[fn_index:-1]
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
444 self.archive.close()
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
445
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
446 def extract_archive(self):
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
447 self.pipe(['7z', 'x', self.filename])
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
448 self.run_pipes()
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
449
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
450
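The listing parser in `get_filenames` above is column-based: the first border line fixes where the name column starts, and the second border ends the table. A minimal standalone sketch of that idea follows. The border pattern and the sample listing are assumptions made for illustration (the real `border_re` for this class is defined outside this excerpt, and the sample is not captured `7z l` output).

```python
import re

# Assumed border pattern: a line made only of dashes and spaces.
BORDER_RE = re.compile(r'^[- ]+$')


def filenames_from_listing(lines):
    """Yield the names found between the two border lines of a listing."""
    fn_index = None
    for line in lines:
        if BORDER_RE.match(line):
            if fn_index is not None:
                break  # second border: the table is over
            # First border: the name column starts after its last space.
            fn_index = line.rindex(' ') + 1
        elif fn_index is not None:
            yield line[fn_index:-1]  # slice off the trailing newline


# Made-up listing in the general shape of a bordered table.
SAMPLE = [
    "Size Attr Name\n",
    "---- ---- --------\n",
    "  12 ...A a.txt\n",
    " 345 ...A dir/b c.txt\n",
    "---- ---- --------\n",
    " 357      2 files\n",
]

print(list(filenames_from_listing(SAMPLE)))
```

Because each name is taken by column position rather than by splitting on whitespace, a name that contains spaces comes through intact.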
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
451 class CABExtractor(NoPipeExtractor):
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
452 file_type = 'CAB archive'
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
453 border_re = re.compile(r'^[-\+]+$')
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
454
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
455 def get_filenames(self):
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
456 self.pipe(['cabextract', '-l', self.filename], "listing")
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
457 self.run_pipes()
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
458 self.archive.seek(0, 0)
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
459 fn_index = None
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
460 for line in self.archive:
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
461 if self.border_re.match(line):
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
462 break
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
463 for line in self.archive:
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
464 try:
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
465 yield line.split(' | ', 2)[2].rstrip('\n')
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
466 except IndexError:
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
467 break
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
468 self.archive.close()
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
469
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
470 def extract_archive(self):
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
471 self.pipe(['cabextract', '-q', self.filename])
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
472 self.run_pipes()
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
473
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
474
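The CAB listing parser above works differently from the 7z one: it skips everything up to the border line, then takes the third ` | `-separated field of each row, and stops at the first line that has no such field. A rough sketch under those assumptions is below; the sample text is invented for illustration and is not captured `cabextract -l` output.

```python
import re

BORDER_RE = re.compile(r'^[-\+]+$')  # same pattern as CABExtractor.border_re


def cab_filenames(lines):
    """Yield the name field of each row that follows the border line."""
    lines = iter(lines)  # one shared iterator, like the open file above
    for line in lines:
        if BORDER_RE.match(line):
            break  # header is over; rows start on the next line
    for line in lines:
        try:
            # maxsplit=2 keeps any ' | ' inside the name itself
            yield line.split(' | ', 2)[2].rstrip('\n')
        except IndexError:
            break  # a line without three fields ends the table


# Invented listing for illustration only.
SAMPLE = [
    "Viewing cabinet: example.cab\n",
    " File size | Date       Time     | Name\n",
    "-----------+---------------------+-------------\n",
    "     12345 | 06.02.2008 19:38:26 | setup.exe\n",
    "       678 | 06.02.2008 19:38:26 | data | odd.txt\n",
    "\n",
    "All done, no errors.\n",
]

print(list(cab_filenames(SAMPLE)))
```

The second loop resumes where the first one stopped only because both loops consume the same iterator; iterating over a plain list twice would restart from the top.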
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
475 class BaseHandler(object):
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
476 def __init__(self, extractor, options):
8
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
477 self.extractor = extractor
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
478 self.options = options
17
481a2b4be471 [svn] Lots of tests for various boundary cases, and slightly better handling for
brett
parents: 16
diff changeset
479 self.target = None
16
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
480
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
481 def handle(self):
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
482 command = 'find'
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
483 status = subprocess.call(['find', self.extractor.target, '-type', 'd',
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
484 '-exec', 'chmod', 'u+rwx', '{}', ';'])
8
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
485 if status == 0:
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
486 command = 'chmod'
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
487 status = subprocess.call(['chmod', '-R', 'u+rwX',
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
488 self.extractor.target])
8
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
489 if status != 0:
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
490 return "%s returned with exit status %s" % (command, status)
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
491 return self.organize()
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
492
41
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
493 def set_target(self, target, checker):
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
494 self.target = checker(target).check()
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
495 if self.target != target:
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
496 logger.warning("extracting %s to %s" %
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
497 (self.extractor.filename, self.target))
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
498
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
499
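`BaseHandler.handle` above fixes permissions in two passes: `find ... -type d -exec chmod u+rwx` first, so every directory can be entered, then `chmod -R u+rwX` for everything else. The sketch below is a pure-Python approximation of that effect, not the script's method. The function name is hypothetical, and it assumes a tree without symbolic links.

```python
import os
import stat


def make_user_accessible(root):
    """Approximate 'find -type d ... chmod u+rwx' plus 'chmod -R u+rwX'."""
    os.chmod(root, os.stat(root).st_mode | stat.S_IRWXU)
    for curdir, dirs, files in os.walk(root):
        # Open up each subdirectory before os.walk descends into it.
        for name in dirs:
            path = os.path.join(curdir, name)
            os.chmod(path, os.stat(path).st_mode | stat.S_IRWXU)
        for name in files:
            path = os.path.join(curdir, name)
            mode = os.stat(path).st_mode
            extra = stat.S_IRUSR | stat.S_IWUSR
            # Like chmod's 'X': add u+x only if some execute bit is set.
            if mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH):
                extra |= stat.S_IXUSR
            os.chmod(path, mode | extra)
```

The ordering matters for the same reason it does in the original: a directory without read and execute permission cannot be listed, so directories have to be fixed before anything inside them can be reached.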
16
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
500 # The "where to extract" table, with options and archive types.
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
501 # This dictates the contents of each can_handle method.
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
502 #
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
503 #         Flat           Overwrite            None
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
504 # File    basename       basename             FilenameChecked
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
505 # Match   .              .                    tempdir + checked
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
506 # Bomb    .              basename             DirectoryChecked
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
507
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
508 class FlatHandler(BaseHandler):
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
509 def can_handle(contents, options):
22
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
510 return ((options.flat and (contents != ONE_ENTRY_KNOWN)) or
16
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
511 (options.overwrite and (contents == MATCHING_DIRECTORY)))
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
512 can_handle = staticmethod(can_handle)
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
513
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
514 def organize(self):
17
481a2b4be471 [svn] Lots of tests for various boundary cases, and slightly better handling for
brett
parents: 16
diff changeset
515 self.target = '.'
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
516 for curdir, dirs, filenames in os.walk(self.extractor.target,
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
517 topdown=False):
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
518 path_parts = curdir.split(os.sep)
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
519 if path_parts[0] == '.':
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
520 del path_parts[1]
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
521 else:
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
522 del path_parts[0]
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
523 newdir = os.path.join(*path_parts)
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
524 if not os.path.isdir(newdir):
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
525 os.makedirs(newdir)
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
526 for filename in filenames:
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
527 os.rename(os.path.join(curdir, filename),
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
528 os.path.join(newdir, filename))
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
529 os.rmdir(curdir)
16
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
530
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
531
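The path arithmetic in `FlatHandler.organize` above removes the temporary extraction directory from each walked path, so that files end up relative to the current directory. A small sketch of just that step is below. The function name and the `.dtrx-tmp` directory name are hypothetical, and it assumes the extraction target is either `./<tempdir>` or a path with at least one component under the temporary directory.

```python
import os


def strip_extraction_dir(curdir):
    """Drop the temporary directory component from a walked path."""
    path_parts = curdir.split(os.sep)
    if path_parts[0] == '.':
        del path_parts[1]  # './tmp/docs' -> './docs'
    else:
        del path_parts[0]  # 'tmp/docs'   -> 'docs'
    return os.path.join(*path_parts)


print(strip_extraction_dir(os.path.join('.', '.dtrx-tmp', 'docs')))
print(strip_extraction_dir(os.path.join('.', '.dtrx-tmp')))
```

In the original, `os.walk` is called with `topdown=False`, so the deepest directories are emptied and removed first and each `os.rmdir(curdir)` finds its directory already empty.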
16
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
532 class OverwriteHandler(BaseHandler):
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
533 def can_handle(contents, options):
22
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
534 return ((options.flat and (contents == ONE_ENTRY_KNOWN)) or
16
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
535 (options.overwrite and (contents != MATCHING_DIRECTORY)))
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
536 can_handle = staticmethod(can_handle)
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
537
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
538 def organize(self):
16
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
539 self.target = self.extractor.basename()
51
f1789e6586d8 [svn] Don't try to rmtree when overwriting just a file.
brett
parents: 49
diff changeset
540 if os.path.isdir(self.target):
f1789e6586d8 [svn] Don't try to rmtree when overwriting just a file.
brett
parents: 49
diff changeset
541 shutil.rmtree(self.target)
48
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
542 os.rename(self.extractor.target, self.target)
16
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
543
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
544
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
545 class MatchHandler(BaseHandler):
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
546 def can_handle(contents, options):
20
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
547 return ((contents == MATCHING_DIRECTORY) or
22
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
548 ((contents == ONE_ENTRY) and
25
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
549 options.one_entry_policy.ok_for_match()))
16
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
550 can_handle = staticmethod(can_handle)
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
551
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
552 def organize(self):
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
553 source = os.path.join(self.extractor.target,
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
554 os.listdir(self.extractor.target)[0])
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
555 if os.path.isdir(source):
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
556 checker = DirectoryChecker
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
557 else:
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
558 checker = FilenameChecker
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
559 if self.options.one_entry_policy == EXTRACT_HERE:
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
560 destination = self.extractor.content_name.rstrip('/')
20
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
561 else:
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
562 destination = self.extractor.basename()
41
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
563 self.set_target(destination, checker)
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
564 if os.path.isdir(self.extractor.target):
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
565 os.rename(source, self.target)
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
566 os.rmdir(self.extractor.target)
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
567 else:
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
568 os.rename(self.extractor.target, self.target)
53
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
569 self.extractor.included_root = './'
8
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
570
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
571
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
572 class EmptyHandler(object):
16
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
573 def can_handle(contents, options):
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
574 return contents == EMPTY
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
575 can_handle = staticmethod(can_handle)
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
576
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
577 def __init__(self, extractor, options): pass
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
578 def handle(self): pass
8
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
579
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
580
16
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
581 class BombHandler(BaseHandler):
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
582 def can_handle(contents, options):
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
583 return True
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
584 can_handle = staticmethod(can_handle)
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
585
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
586 def organize(self):
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
587 basename = self.extractor.basename()
41
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
588 self.set_target(basename, self.extractor.name_checker)
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
589 os.rename(self.extractor.target, self.target)
16
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
590
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
591
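The `can_handle` predicates above overlap, and `BombHandler.can_handle` always returns True, which suggests the handlers are tried in order and the first match wins. The code that does the choosing is outside this excerpt, so the sketch below is an assumption about how such a dispatch could look. The constant values, the `Options` tuple, the boolean `one_entry_ok` (standing in for `options.one_entry_policy.ok_for_match()`), and the ordering are all hypothetical.

```python
from collections import namedtuple

# Hypothetical stand-ins for the content-type constants in the listing.
ONE_ENTRY_KNOWN, MATCHING_DIRECTORY, ONE_ENTRY, BOMB, EMPTY = range(5)

# Hypothetical options record; one_entry_ok replaces the policy object.
Options = namedtuple("Options", "flat overwrite one_entry_ok")

# (name, predicate) pairs mirroring the can_handle methods above,
# in the order the classes appear in the listing (an assumption).
HANDLERS = [
    ("Flat", lambda c, o: (o.flat and c != ONE_ENTRY_KNOWN) or
                          (o.overwrite and c == MATCHING_DIRECTORY)),
    ("Overwrite", lambda c, o: (o.flat and c == ONE_ENTRY_KNOWN) or
                               (o.overwrite and c != MATCHING_DIRECTORY)),
    ("Match", lambda c, o: c == MATCHING_DIRECTORY or
                           (c == ONE_ENTRY and o.one_entry_ok)),
    ("Empty", lambda c, o: c == EMPTY),
    ("Bomb", lambda c, o: True),  # catch-all, so it must come last
]


def pick_handler(contents, options):
    """Return the name of the first handler whose predicate accepts."""
    for name, can_handle in HANDLERS:
        if can_handle(contents, options):
            return name


print(pick_handler(BOMB, Options(False, False, False)))
```

With a catch-all at the end, `pick_handler` always returns a name, and the position of each handler in the list decides which one wins when two predicates are both true.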
25
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
592 class BasePolicy(object):
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
593 def __init__(self, options):
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
594 self.current_policy = None
26
d660410455d9 [svn] Little DRY cleanups.
brett
parents: 25
diff changeset
595 if options.batch:
d660410455d9 [svn] Little DRY cleanups.
brett
parents: 25
diff changeset
596 self.permanent_policy = self.answers['']
d660410455d9 [svn] Little DRY cleanups.
brett
parents: 25
diff changeset
597 else:
d660410455d9 [svn] Little DRY cleanups.
brett
parents: 25
diff changeset
598 self.permanent_policy = None
25
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
599
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
600 def ask_question(self, question):
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
601 question = textwrap.wrap(question) + self.choices
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
602 while True:
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
603 print "\n".join(question)
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
604 try:
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
605 answer = raw_input(self.prompt)
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
606 except EOFError:
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
607 return self.answers['']
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
608 try:
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
609 return self.answers[answer.lower()]
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
610 except KeyError:
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
611 print
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
612
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
613 def __cmp__(self, other):
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
614 return cmp(self.current_policy, other)
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
615
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
616
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
617 class OneEntryPolicy(BasePolicy):
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
618 answers = {'h': EXTRACT_HERE, 'i': EXTRACT_WRAP, 'r': EXTRACT_RENAME,
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
619 '': EXTRACT_WRAP}
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
620 choices = ["You can:",
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
621 " * extract it Inside another directory",
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
622 " * extract it and Rename the directory",
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
623 " * extract it Here"]
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
624 prompt = "What do you want to do? (I/r/h) "
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
625
26
d660410455d9 [svn] Little DRY cleanups.
brett
parents: 25
diff changeset
626 def prep(self, archive_filename, entry_name):
d660410455d9 [svn] Little DRY cleanups.
brett
parents: 25
diff changeset
627 question = ("%s contains one entry: %s." %
d660410455d9 [svn] Little DRY cleanups.
brett
parents: 25
diff changeset
628 (archive_filename, entry_name))
d660410455d9 [svn] Little DRY cleanups.
brett
parents: 25
diff changeset
629 self.current_policy = (self.permanent_policy or
d660410455d9 [svn] Little DRY cleanups.
brett
parents: 25
diff changeset
630 self.ask_question(question))
25
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
631
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
632 def ok_for_match(self):
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
633 return self.current_policy in (EXTRACT_RENAME, EXTRACT_HERE)
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
634
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
635
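The prompt loop in `BasePolicy.ask_question` above has three outcomes: a recognized letter (in either case) returns its policy, an empty reply or end-of-input returns the default stored under `''`, and anything else asks again. The sketch below restates that loop in Python 3 syntax (the listing itself is Python 2) with an injectable `read` function so it can be exercised without a terminal. The `ANSWERS` values and the `read` parameter are hypothetical.

```python
# Hypothetical policy values; the listing uses EXTRACT_* constants.
ANSWERS = {'h': 'here', 'i': 'wrap', 'r': 'rename', '': 'wrap'}


def ask(answers, prompt, read=input):
    """Keep prompting until the reply maps to a known answer."""
    while True:
        try:
            answer = read(prompt)
        except EOFError:
            return answers['']  # end of input: fall back to the default
        try:
            return answers[answer.lower()]
        except KeyError:
            print()  # unknown reply: blank line, then ask again


# Scripted replies: an unknown letter first, then an upper-case 'R'.
replies = iter(["x", "R"])
print(ask(ANSWERS, "What do you want to do? (I/r/h) ",
          lambda prompt: next(replies)))
```

Storing the default under the empty-string key means that pressing Enter, hitting end-of-input, and batch mode all resolve to the same entry.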
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
636 class RecursionPolicy(BasePolicy):
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
637 answers = {'o': RECURSE_ONCE, 'a': RECURSE_ALWAYS, 'n': RECURSE_NOT_NOW,
53
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
638 'v': RECURSE_NEVER, 'l': RECURSE_LIST, '': RECURSE_NOT_NOW}
25
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
639 choices = ["You can:",
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
640 " * Always extract included archives",
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
641 " * extract included archives this Once",
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
642 " * choose Not to extract included archives",
53
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
643 " * neVer extract included archives",
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
644 " * List included archives"]
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
645 prompt = "What do you want to do? (a/o/N/v/l) "
25
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
646
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
647 def __init__(self, options):
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
648 BasePolicy.__init__(self, options)
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
649 if options.show_list:
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
650 self.permanent_policy = RECURSE_NEVER
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
651 elif options.recursive:
25
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
652 self.permanent_policy = RECURSE_ALWAYS
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
653
53
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
654 def prep(self, current_filename, target, extractor):
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
655 archive_count = len(extractor.included_archives)
25
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
656 if (self.permanent_policy is not None) or (archive_count == 0):
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
657 self.current_policy = self.permanent_policy or RECURSE_NOT_NOW
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
658 return
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
659 elif archive_count > 1:
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
660 question = ("%s contains %s other archive files." %
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
661 (current_filename, archive_count))
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
662 else:
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
663 question = ("%s contains another archive: %s." %
53
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
664 (current_filename, extractor.included_archives[0]))
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
665 if target == '.':
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
666 target = ''
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
667 included_root = extractor.included_root
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
668 if included_root == './':
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
669 included_root = ''
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
670 while True:
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
671 self.current_policy = self.ask_question(question)
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
672 if self.current_policy != RECURSE_LIST:
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
673 break
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
674 print ("\n%s\n" %
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
675 '\n'.join([os.path.join(target, included_root, filename)
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
676 for filename in extractor.included_archives]))
25
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
677 if self.current_policy in (RECURSE_ALWAYS, RECURSE_NEVER):
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
678 self.permanent_policy = self.current_policy
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
679
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
680 def ok_to_recurse(self):
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
681 return self.current_policy in (RECURSE_ALWAYS, RECURSE_ONCE)
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
682
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
683
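The sticky behaviour of `prep` and `ok_to_recurse` above can be sketched in isolation. This is a minimal Python 3 sketch, not the dtrx code: the `StickyPolicy` name, the `ask` callback, and the numeric values of the `RECURSE_*` constants are assumptions made for illustration (the real constants are defined elsewhere in the script, and the real class prompts the user through `ask_question`). The `RECURSE_LIST` loop is left out.

```python
# Hypothetical stand-ins for the RECURSE_* constants. They start at 1 so
# that none of them is falsy, which the "or" fallback below relies on.
RECURSE_ALWAYS, RECURSE_ONCE, RECURSE_NOT_NOW, RECURSE_NEVER = range(1, 5)


class StickyPolicy(object):
    """Illustrative only: remembers 'always' and 'never' answers."""

    def __init__(self, ask):
        self.ask = ask  # callable returning one RECURSE_* value
        self.permanent_policy = None
        self.current_policy = None

    def prep(self, archive_count):
        # Skip the question when a permanent answer already exists or
        # when the archive contains nothing to recurse into.
        if self.permanent_policy is not None or archive_count == 0:
            self.current_policy = self.permanent_policy or RECURSE_NOT_NOW
            return
        self.current_policy = self.ask()
        # Only "always" and "never" are remembered for later archives.
        if self.current_policy in (RECURSE_ALWAYS, RECURSE_NEVER):
            self.permanent_policy = self.current_policy

    def ok_to_recurse(self):
        return self.current_policy in (RECURSE_ALWAYS, RECURSE_ONCE)
```

In this sketch a one-time answer such as `RECURSE_ONCE` applies to the current archive only, so the next archive with nested archives triggers the question again, whereas `RECURSE_ALWAYS` or `RECURSE_NEVER` suppresses every later question.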
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
684 class ExtractorBuilder(object):
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
685 extractor_map = {'tar': (TarExtractor, None),
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
686 'zip': (ZipExtractor, None),
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
687 'deb': (DebExtractor, DebMetadataExtractor),
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
688 'rpm': (RPMExtractor, None),
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
689 'cpio': (CpioExtractor, None),
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
690 'gem': (GemExtractor, GemMetadataExtractor),
32
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
691 'compress': (CompressionExtractor, None),
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
692 '7z': (SevenExtractor, None),
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
693 'cab': (CABExtractor, None)}
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
694
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
695 mimetype_map = {}
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
696 for mapping in (('tar', 'x-tar'),
59
7a0aafe2fe87 [svn] Find self-extracting archives by their file magic only, not extension/mimetype.
brett
parents: 58
diff changeset
697 ('zip', 'zip'),
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
698 ('deb', 'x-debian-package'),
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
699 ('rpm', 'x-redhat-package-manager', 'x-rpm'),
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
700 ('cpio', 'x-cpio'),
32
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
701 ('gem', 'x-ruby-gem'),
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
702 ('7z', 'x-7z-compressed'),
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
703 ('cab', 'x-cab')):
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
704 for mimetype in mapping[1:]:
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
705 if '/' not in mimetype:
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
706 mimetype = 'application/' + mimetype
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
707 mimetype_map[mimetype] = mapping[0]
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
708
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
709 magic_mime_map = {}
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
710 for mapping in (('deb', 'Debian binary package'),
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
711 ('cpio', 'cpio archive'),
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
712 ('tar', 'POSIX tar archive'),
59
7a0aafe2fe87 [svn] Find self-extracting archives by their file magic only, not extension/mimetype.
brett
parents: 58
diff changeset
713 ('zip', '(Zip|ZIP self-extracting) archive'),
32
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
714 ('rpm', 'RPM'),
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
715 ('7z', '7-zip archive'),
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
716 ('cab', 'Microsoft Cabinet archive')):
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
717 for pattern in mapping[1:]:
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
718 magic_mime_map[re.compile(pattern)] = mapping[0]
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
719
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
720 magic_encoding_map = {}
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
721 for mapping in (('bzip2', 'bzip2 compressed'),
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
722 ('gzip', 'gzip compressed')):
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
723 for pattern in mapping[1:]:
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
724 magic_encoding_map[re.compile(pattern)] = mapping[0]
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
725
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
726 extension_map = {}
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
727 for mapping in (('tar', 'bzip2', 'tar.bz2'),
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
728 ('tar', 'gzip', 'tar.gz', 'tgz'),
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
729 ('tar', None, 'tar'),
59
7a0aafe2fe87 [svn] Find self-extracting archives by their file magic only, not extension/mimetype.
brett
parents: 58
diff changeset
730 ('zip', None, 'zip'),
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
731 ('deb', None, 'deb'),
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
732 ('rpm', None, 'rpm'),
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
733 ('cpio', None, 'cpio'),
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
734 ('gem', None, 'gem'),
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
735 ('compress', 'gzip', 'Z', 'gz'),
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
736 ('compress', 'bzip2', 'bz2'),
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
737 ('compress', 'lzma', 'lzma'),
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
738 ('7z', None, '7z'),
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
739 ('cab', None, 'cab', 'exe')):
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
740 for extension in mapping[2:]:
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
741 extension_map.setdefault(extension, []).append(mapping[:2])
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
742
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
743 def __init__(self, filename, options):
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
744 self.filename = filename
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
745 self.options = options
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
746
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
747 def build_extractor(self, archive_type, encoding):
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
748 extractors = self.extractor_map[archive_type]
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
749 if self.options.metadata and (extractors[1] is not None):
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
750 extractor = extractors[1]
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
751 else:
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
752 extractor = extractors[0]
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
753 return extractor(self.filename, encoding)
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
754
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
755 def get_extractor(self):
48
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
756 tried_types = set()
36
4bf2508d9b9e [svn] Small optimization to be nice to the system: don't try a given extractor
brett
parents: 35
diff changeset
757 # As smart as it is, the magic test can't go first, because at least
4bf2508d9b9e [svn] Small optimization to be nice to the system: don't try a given extractor
brett
parents: 35
diff changeset
758 # on my system it just recognizes gem files as tar files. I guess
4bf2508d9b9e [svn] Small optimization to be nice to the system: don't try a given extractor
brett
parents: 35
diff changeset
759 # it's possible for the opposite problem to occur -- where the mimetype
4bf2508d9b9e [svn] Small optimization to be nice to the system: don't try a given extractor
brett
parents: 35
diff changeset
760 # or extension suggests something less than ideal -- but it seems less
4bf2508d9b9e [svn] Small optimization to be nice to the system: don't try a given extractor
brett
parents: 35
diff changeset
761 # likely so I'm sticking with this.
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
762 for func_name in ('mimetype', 'extension', 'magic'):
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
763 logger.debug("getting extractors by %s" % (func_name,))
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
764 extractor_types = \
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
765 getattr(self, 'try_by_' + func_name)(self.filename)
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
766 logger.debug("done getting extractors")
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
767 for ext_args in extractor_types:
36
4bf2508d9b9e [svn] Small optimization to be nice to the system: don't try a given extractor
brett
parents: 35
diff changeset
768 if ext_args in tried_types:
4bf2508d9b9e [svn] Small optimization to be nice to the system: don't try a given extractor
brett
parents: 35
diff changeset
769 continue
4bf2508d9b9e [svn] Small optimization to be nice to the system: don't try a given extractor
brett
parents: 35
diff changeset
770 tried_types.add(ext_args)
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
771 logger.debug("trying %s extractor from %s" %
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
772 (ext_args, func_name))
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
773 yield self.build_extractor(*ext_args)
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
774
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
775 def try_by_mimetype(cls, filename):
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
776 mimetype, encoding = mimetypes.guess_type(filename)
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
777 try:
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
778 return [(cls.mimetype_map[mimetype], encoding)]
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
779 except KeyError:
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
780 if encoding:
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
781 return [('compress', encoding)]
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
782 return []
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
783 try_by_mimetype = classmethod(try_by_mimetype)
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
784
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
785 def magic_map_matches(cls, output, magic_map):
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
786 return [result for regexp, result in magic_map.items()
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
787 if regexp.search(output)]
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
788 magic_map_matches = classmethod(magic_map_matches)
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
789
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
790 def try_by_magic(cls, filename):
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
791 process = subprocess.Popen(['file', '-z', filename],
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
792 stdout=subprocess.PIPE)
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
793 status = process.wait()
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
794 if status != 0:
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
795 return []
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
796 output = process.stdout.readline()
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
797 process.stdout.close()
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
798 if output.startswith('%s: ' % filename):
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
799 output = output[len(filename) + 2:]
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
800 mimes = cls.magic_map_matches(output, cls.magic_mime_map)
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
801 encodings = cls.magic_map_matches(output, cls.magic_encoding_map)
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
802 if mimes and not encodings:
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
803 encodings = [None]
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
804 elif encodings and not mimes:
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
805 mimes = ['compress']
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
806 return [(m, e) for m in mimes for e in encodings]
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
807 try_by_magic = classmethod(try_by_magic)
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
808
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
809 def try_by_extension(cls, filename):
43
4591a32eedc8 [svn] Sadly Python 2.3 does not have an rsplit method on strings.
brett
parents: 42
diff changeset
810 parts = filename.split('.')[-2:]
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
811 results = []
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
812 while parts:
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
813 results.extend(cls.extension_map.get('.'.join(parts), []))
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
814 del parts[0]
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
815 return results
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
816 try_by_extension = classmethod(try_by_extension)
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
817
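The lookup in `try_by_extension` above tries the last two dot-separated parts of the filename first, then the last part alone. A minimal Python 3 sketch of that lookup follows; the `lookup_by_extension` name and the `SAMPLE_MAP` subset are assumptions for illustration, not part of dtrx.

```python
def lookup_by_extension(filename, extension_map):
    """Return candidate (archive_type, encoding) pairs for a filename."""
    # Keep at most the last two dot-separated parts, e.g. ['tar', 'gz'].
    parts = filename.split('.')[-2:]
    results = []
    while parts:
        # Try the longest suffix first ('tar.gz'), then the shorter ('gz').
        results.extend(extension_map.get('.'.join(parts), []))
        del parts[0]
    return results


# A small, hypothetical subset of the extension map built in the class.
SAMPLE_MAP = {
    'tar.gz': [('tar', 'gzip')],
    'gz': [('compress', 'gzip')],
    'tar': [('tar', None)],
}
```

With this sample map, `lookup_by_extension('foo.tar.gz', SAMPLE_MAP)` yields the tar candidate before the plain-compression candidate, so the more specific guess comes first. Because the filename is split on dots without checking that one is present, a file named just `tar` also matches the `tar` entry.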
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
818
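The way `try_by_magic` above combines archive-type matches with encoding matches can be sketched without running `file`. This is a minimal Python 3 sketch under stated assumptions: the `candidates_from_magic` name and the two reduced pattern maps are hypothetical, and the description strings used with it are illustrative, not captured from a real `file -z` run.

```python
import re

# Hypothetical subsets of the pattern maps built in the class above.
MAGIC_MIME_MAP = {re.compile('POSIX tar archive'): 'tar',
                  re.compile('cpio archive'): 'cpio'}
MAGIC_ENCODING_MAP = {re.compile('gzip compressed'): 'gzip',
                      re.compile('bzip2 compressed'): 'bzip2'}


def magic_map_matches(output, magic_map):
    """Return the mapped value of every pattern found in the output."""
    return [result for regexp, result in magic_map.items()
            if regexp.search(output)]


def candidates_from_magic(output):
    """Pair every matching archive type with every matching encoding."""
    mimes = magic_map_matches(output, MAGIC_MIME_MAP)
    encodings = magic_map_matches(output, MAGIC_ENCODING_MAP)
    if mimes and not encodings:
        encodings = [None]        # a plain, uncompressed archive
    elif encodings and not mimes:
        mimes = ['compress']      # compressed data with no known archive type
    return [(m, e) for m in mimes for e in encodings]
```

For a description that mentions both a tar archive and gzip compression the sketch returns the single pair `('tar', 'gzip')`; a description that matches nothing returns an empty list, which lets the caller move on without trying any extractor.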
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
819 class BaseAction(object):
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
820 def __init__(self, options, filenames):
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
821 self.options = options
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
822 self.filenames = filenames
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
823 self.target = None
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
824
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
825 def report(self, function, *args):
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
826 try:
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
827 error = function(*args)
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
828 except EXTRACTION_ERRORS, exception:
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
829 error = str(exception)
33
3547e3124729 [svn] Fix some bugs and make things a little more user-friendly now that we can
brett
parents: 32
diff changeset
830 logger.debug(''.join(traceback.format_exception(*sys.exc_info())))
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
831 return error
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
832
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
833
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
834 class ExtractionAction(BaseAction):
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
835 handlers = [FlatHandler, OverwriteHandler, MatchHandler, EmptyHandler,
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
836 BombHandler]
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
837
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
838 def __init__(self, options, filenames):
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
839 BaseAction.__init__(self, options, filenames)
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
840 self.did_print = False
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
841
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
842 def get_handler(self, extractor):
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
843 if extractor.content_type == ONE_ENTRY:
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
844 self.options.one_entry_policy.prep(self.current_filename,
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
845 extractor.content_name)
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
846 for handler in self.handlers:
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
847 if handler.can_handle(extractor.content_type, self.options):
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
848 logger.debug("using %s handler" % (handler.__name__,))
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
849 self.current_handler = handler(extractor, self.options)
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
850 break
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
851
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
852 def show_extraction(self, extractor):
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
853 if self.options.log_level > logging.INFO:
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
854 return
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
855 elif self.did_print:
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
856 print
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
857 else:
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
858 self.did_print = True
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
859 print "%s:" % (self.current_filename,)
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
860 if extractor.contents is None:
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
861 print self.current_handler.target
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
862 return
55
494516c027c4 [svn] Stupid Python 2.3 doesn't support [].sort(reverse=True).
brett
parents: 54
diff changeset
863 def reverser(x, y):
494516c027c4 [svn] Stupid Python 2.3 doesn't support [].sort(reverse=True).
brett
parents: 54
diff changeset
864 return cmp(y, x)
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
865 if self.current_handler.target == '.':
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
866 filenames = extractor.contents
55
494516c027c4 [svn] Stupid Python 2.3 doesn't support [].sort(reverse=True).
brett
parents: 54
diff changeset
867 filenames.sort(reverser)
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
868 else:
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
869 filenames = [self.current_handler.target]
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
870 pathjoin = os.path.join
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
871 isdir = os.path.isdir
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
872 while filenames:
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
873 filename = filenames.pop()
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
874 if isdir(filename):
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
875 print "%s/" % (filename,)
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
876 new_filenames = os.listdir(filename)
55
494516c027c4 [svn] Stupid Python 2.3 doesn't support [].sort(reverse=True).
brett
parents: 54
diff changeset
877 new_filenames.sort(reverser)
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
878 filenames.extend([pathjoin(filename, new_filename)
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
879 for new_filename in new_filenames])
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
880 else:
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
881 print filename
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
882
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
883 def run(self, filename, extractor):
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
884 self.current_filename = filename
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
885 error = (self.report(extractor.extract) or
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
886 self.report(self.get_handler, extractor) or
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
887 self.report(self.current_handler.handle) or
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
888 self.report(self.show_extraction, extractor))
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
889 if not error:
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
890 self.target = self.current_handler.target
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
891 return error
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
892
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
893
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
894 class ListAction(BaseAction):
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
895 def __init__(self, options, filenames):
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
896 BaseAction.__init__(self, options, filenames)
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
897 self.count = 0
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
898
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
899 def get_list(self, extractor):
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
900 # Note: The reason I'm getting all the filenames up front is
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
901 # because if we run into trouble partway through the archive, we'll
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
902 # try another extractor. So before we display anything we have to
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
903 # be sure this one is successful. We maybe don't have to be quite
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
904 # this conservative but this is the easy way out for now.
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
905 self.filelist = list(extractor.get_filenames())
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
906
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
907 def show_list(self, filename):
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
908 self.count += 1
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
909 if len(self.filenames) != 1:
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
910 if self.count > 1:
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
911 print
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
912 print "%s:" % (filename,)
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
913 print '\n'.join(self.filelist)
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
914
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
915 def run(self, filename, extractor):
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
916 return (self.report(self.get_list, extractor) or
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
917 self.report(self.show_list, filename))
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
918
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
919
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
920 class ExtractorApplication(object):
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
921 def __init__(self, arguments):
48
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
922 for signal_num in (signal.SIGINT, signal.SIGTERM):
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
923 signal.signal(signal_num, self.abort)
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
924 self.parse_options(arguments)
12
5d202467c589 [svn] Introduce a real logging system. Right now all this really gets us is the
brett
parents: 11
diff changeset
925 self.setup_logger()
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
926 self.successes = []
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
927 self.failures = []
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
928
48
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
929 def abort(self, signal_num, frame):
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
930 signal.signal(signal_num, signal.SIG_IGN)
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
931 print
49
c76dd2716113 [svn] Add a traceback when we catch a signal.
brett
parents: 48
diff changeset
932 logger.debug("traceback:\n" +
c76dd2716113 [svn] Add a traceback when we catch a signal.
brett
parents: 48
diff changeset
933 ''.join(traceback.format_stack(frame)).rstrip())
48
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
934 logger.debug("got signal %s; cleaning up" % (signal_num,))
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
935 clean_targets = set([os.path.realpath('.')])
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
936 if hasattr(self, 'current_directory'):
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
937 clean_targets.add(os.path.realpath(self.current_directory))
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
938 for directory in clean_targets:
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
939 os.chdir(directory)
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
940 for path in glob.glob('.dtrx-*'):
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
941 try:
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
942 os.unlink(path)
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
943 except OSError, error:
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
944 if error.errno == errno.EISDIR:
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
945 shutil.rmtree(path, ignore_errors=True)
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
946 sys.exit(1)
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
947
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
948 def parse_options(self, arguments):
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
949 parser = optparse.OptionParser(
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
950 usage="%prog [options] archive [archive2 ...]",
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
951 description="Intelligent archive extractor",
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
952 version=VERSION_BANNER
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
953 )
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
954 parser.add_option('-r', '--recursive', dest='recursive',
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
955 action='store_true', default=False,
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
956 help='extract archives contained in the ones listed')
13
0a3ef1b9f6d4 [svn] Add options to tweak the logging level to taste.
brett
parents: 12
diff changeset
957 parser.add_option('-q', '--quiet', dest='quiet',
0a3ef1b9f6d4 [svn] Add options to tweak the logging level to taste.
brett
parents: 12
diff changeset
958 action='count', default=3,
0a3ef1b9f6d4 [svn] Add options to tweak the logging level to taste.
brett
parents: 12
diff changeset
959 help='suppress warning/error messages')
0a3ef1b9f6d4 [svn] Add options to tweak the logging level to taste.
brett
parents: 12
diff changeset
960 parser.add_option('-v', '--verbose', dest='verbose',
0a3ef1b9f6d4 [svn] Add options to tweak the logging level to taste.
brett
parents: 12
diff changeset
961 action='count', default=0,
0a3ef1b9f6d4 [svn] Add options to tweak the logging level to taste.
brett
parents: 12
diff changeset
962 help='be verbose/print debugging information')
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
963 parser.add_option('-o', '--overwrite', dest='overwrite',
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
964 action='store_true', default=False,
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
965 help='overwrite any existing target directory')
15
28dbd52a8bb8 [svn] Add a -f/--flat option, which will extract the archive contents into the
brett
parents: 14
diff changeset
966 parser.add_option('-f', '--flat', '--no-directory', dest='flat',
28dbd52a8bb8 [svn] Add a -f/--flat option, which will extract the archive contents into the
brett
parents: 14
diff changeset
967 action='store_true', default=False,
28dbd52a8bb8 [svn] Add a -f/--flat option, which will extract the archive contents into the
brett
parents: 14
diff changeset
968 help="don't put contents in their own directory")
19
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
969 parser.add_option('-l', '-t', '--list', '--table', dest='show_list',
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
970 action='store_true', default=False,
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
971 help="list contents of archives on standard output")
20
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
972 parser.add_option('-n', '--noninteractive', dest='batch',
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
973 action='store_true', default=False,
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
974 help="don't ask how to handle special cases")
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
975 parser.add_option('-m', '--metadata', dest='metadata',
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
976 action='store_true', default=False,
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
977 help="extract metadata from a .deb/.gem")
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
978 self.options, filenames = parser.parse_args(arguments)
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
979 if not filenames:
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
980 parser.error("you did not list any archives")
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
981 # This makes WARNING the default.
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
982 self.options.log_level = (10 * (self.options.quiet -
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
983 self.options.verbose))
25
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
984 self.options.one_entry_policy = OneEntryPolicy(self.options)
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
985 self.options.recursion_policy = RecursionPolicy(self.options)
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
986 self.archives = {os.path.realpath(os.curdir): filenames}
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
987
12
5d202467c589 [svn] Introduce a real logging system. Right now all this really gets us is the
brett
parents: 11
diff changeset
988 def setup_logger(self):
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
989 logging.getLogger().setLevel(self.options.log_level)
12
5d202467c589 [svn] Introduce a real logging system. Right now all this really gets us is the
brett
parents: 11
diff changeset
990 handler = logging.StreamHandler()
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
991 handler.setLevel(self.options.log_level)
19
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
992 formatter = logging.Formatter("dtrx: %(levelname)s: %(message)s")
12
5d202467c589 [svn] Introduce a real logging system. Right now all this really gets us is the
brett
parents: 11
diff changeset
993 handler.setFormatter(formatter)
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
994 logger.addHandler(handler)
33
3547e3124729 [svn] Fix some bugs and make things a little more user-friendly now that we can
brett
parents: 32
diff changeset
995 logger.debug("logger is set up")
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
996
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
997 def recurse(self, filename, extractor, action):
53
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
998 self.options.recursion_policy.prep(filename, action.target, extractor)
25
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
999 if self.options.recursion_policy.ok_to_recurse():
53
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
1000 for filename in extractor.included_archives:
25
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
1001 tail_path, basename = os.path.split(filename)
47
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
1002 directory = os.path.join(self.current_directory, action.target,
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
1003 extractor.included_root, tail_path)
25
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
1004 self.archives.setdefault(directory, []).append(basename)
8
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
1005
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1006 def check_file(self, filename):
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1007 try:
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1008 result = os.stat(filename)
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1009 except OSError, error:
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1010 return error.strerror
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1011 if stat.S_ISDIR(result.st_mode):
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1012 return "cannot extract a directory"
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1013
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1014 def try_extractors(self, filename, builder):
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1015 errors = []
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1016 for extractor in builder:
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1017 error = self.action.run(filename, extractor)
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1018 if error:
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1019 errors.append((extractor.file_type, extractor.encoding, error))
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1020 else:
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1021 self.recurse(filename, extractor, self.action)
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1022 return
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1023 logger.error("could not handle %s" % (filename,))
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1024 if not errors:
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1025 logger.error("not a known archive type")
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1026 return True
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1027 for file_type, encoding, error in errors:
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1028 message = ["treating as", file_type, "failed:", error]
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1029 if encoding:
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1030 message.insert(1, "%s-encoded" % (encoding,))
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1031 logger.error(' '.join(message))
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1032 return True
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1033
19
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
1034 def run(self):
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
1035 if self.options.show_list:
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
1036 action = ListAction
19
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
1037 else:
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
1038 action = ExtractionAction
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1039 self.action = action(self.options, self.archives.values()[0])
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
1040 while self.archives:
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
1041 self.current_directory, self.filenames = self.archives.popitem()
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
1042 os.chdir(self.current_directory)
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
1043 for filename in self.filenames:
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
1044 builder = ExtractorBuilder(filename, self.options)
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1045 error = (self.check_file(filename) or
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1046 self.try_extractors(filename, builder.get_extractor()))
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1047 if error:
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1048 if error is not True:
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1049 logger.error("%s: %s" % (filename, error))
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1050 self.failures.append(filename)
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
1051 else:
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1052 self.successes.append(filename)
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
1053 self.options.one_entry_policy.permanent_policy = EXTRACT_WRAP
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
1054 if self.failures:
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
1055 return 1
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
1056 return 0
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
1057
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
1058
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
1059 if __name__ == '__main__':
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
1060 app = ExtractorApplication(sys.argv[1:])
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
1061 sys.exit(app.run())
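
Note on the log level computed in parse_options (lines 957-962 and 981-983 of the listing): the sketch below is not part of dtrx. It is a minimal, standalone illustration of how the -q/-v counts appear to map onto the standard logging levels. The helper name compute_log_level is hypothetical; the option definitions and the 10 * (quiet - verbose) arithmetic are copied from the listing, and the sketch is written to run on current Python rather than the Python 2 of the original.

```python
import logging
import optparse


def compute_log_level(arguments):
    """Hypothetical helper mirroring the -q/-v arithmetic in parse_options."""
    parser = optparse.OptionParser()
    # 'count' options start from their default and add 1 per occurrence,
    # so quiet starts at 3 and verbose starts at 0.
    parser.add_option('-q', '--quiet', dest='quiet',
                      action='count', default=3)
    parser.add_option('-v', '--verbose', dest='verbose',
                      action='count', default=0)
    options, _ = parser.parse_args(arguments)
    # Same formula as the listing: each -q raises the threshold by 10,
    # each -v lowers it by 10.
    return 10 * (options.quiet - options.verbose)


if __name__ == '__main__':
    for flags in ([], ['-v'], ['-vv'], ['-q']):
        level = compute_log_level(flags)
        print(flags, level, logging.getLevelName(level))
```

With no flags the result is 30, which is logging.WARNING; one -v gives logging.INFO, two give logging.DEBUG, and one -q gives logging.ERROR.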
