scripts/dtrx

Wed, 30 Apr 2008 21:33:30 -0400

author
Brett Smith <brett@brettcsmith.org>
date
Wed, 30 Apr 2008 21:33:30 -0400
branch
trunk
changeset 60
6d7246a56b00
parent 59
7a0aafe2fe87
child 62
17d845dacff5
permissions
-rwxr-xr-x

Add missing test tarballs

1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
1 #!/usr/bin/env python
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
2 #
19
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
3 # dtrx -- Intelligently extract various archive types.
54
cd43d2f61162 [svn] Update copyright dates in the license headers.
brett
parents: 53
diff changeset
4 # Copyright (c) 2006, 2007, 2008 Brett Smith <brettcsmith@brettcsmith.org>.
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
5 #
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
6 # This program is free software; you can redistribute it and/or modify it
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
7 # under the terms of the GNU General Public License as published by the
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
8 # Free Software Foundation; either version 3 of the License, or (at your
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
9 # option) any later version.
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
10 #
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
11 # This program is distributed in the hope that it will be useful, but
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
12 # WITHOUT ANY WARRANTY; without even the implied warranty of
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
13 # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
14 # Public License for more details.
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
15 #
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
16 # You should have received a copy of the GNU General Public License along
42
4a4cab75d5e6 [svn] Update documentation.
brett
parents: 41
diff changeset
17 # with this program; if not, see <http://www.gnu.org/licenses/>.
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
18
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
19 import errno
48
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
20 import glob
12
5d202467c589 [svn] Introduce a real logging system. Right now all this really gets us is the
brett
parents: 11
diff changeset
21 import logging
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
22 import mimetypes
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
23 import optparse
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
24 import os
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
25 import re
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
26 import shutil
48
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
27 import signal
15
28dbd52a8bb8 [svn] Add a -f/--flat option, which will extract the archive contents into the
brett
parents: 14
diff changeset
28 import stat
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
29 import subprocess
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
30 import sys
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
31 import tempfile
20
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
32 import textwrap
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
33 import traceback
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
34
48
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
35 from sets import Set as set
36
4bf2508d9b9e [svn] Small optimization to be nice to the system: don't try a given extractor
brett
parents: 35
diff changeset
36
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
37 VERSION = "6.0"
19
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
38 VERSION_BANNER = """dtrx version %s
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
39 Copyright (c) 2006, 2007, 2008 Brett Smith <brettcsmith@brettcsmith.org>
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
40
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
41 This program is free software; you can redistribute it and/or modify it
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
42 under the terms of the GNU General Public License as published by the
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
43 Free Software Foundation; either version 3 of the License, or (at your
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
44 option) any later version.
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
45
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
46 This program is distributed in the hope that it will be useful, but
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
47 WITHOUT ANY WARRANTY; without even the implied warranty of
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
48 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
49 Public License for more details.""" % (VERSION,)
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
50
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
51 MATCHING_DIRECTORY = 1
22
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
52 ONE_ENTRY = 2
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
53 BOMB = 3
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
54 EMPTY = 4
22
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
55 ONE_ENTRY_KNOWN = 5
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
56
20
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
57 EXTRACT_HERE = 1
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
58 EXTRACT_WRAP = 2
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
59 EXTRACT_RENAME = 3
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
60
23
039dd321a7d0 [svn] If an archive contains other archives, and the user didn't specify that
brett
parents: 22
diff changeset
61 RECURSE_ALWAYS = 1
039dd321a7d0 [svn] If an archive contains other archives, and the user didn't specify that
brett
parents: 22
diff changeset
62 RECURSE_ONCE = 2
039dd321a7d0 [svn] If an archive contains other archives, and the user didn't specify that
brett
parents: 22
diff changeset
63 RECURSE_NOT_NOW = 3
039dd321a7d0 [svn] If an archive contains other archives, and the user didn't specify that
brett
parents: 22
diff changeset
64 RECURSE_NEVER = 4
53
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
65 RECURSE_LIST = 5
23
039dd321a7d0 [svn] If an archive contains other archives, and the user didn't specify that
brett
parents: 22
diff changeset
66
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
67 mimetypes.encodings_map.setdefault('.bz2', 'bzip2')
34
a8f875e02c83 [svn] Add support for LZMA compression. Holy crap that was easy.
brett
parents: 33
diff changeset
68 mimetypes.encodings_map.setdefault('.lzma', 'lzma')
41
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
69 mimetypes.types_map.setdefault('.gem', 'application/x-ruby-gem')
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
70
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
71 logger = logging.getLogger('dtrx-log')
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
72
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
73 class FilenameChecker(object):
58
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
diff changeset
74 free_func = os.open
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
diff changeset
75 free_args = (os.O_CREAT | os.O_EXCL,)
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
diff changeset
76 free_close = os.close
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
diff changeset
77
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
78 def __init__(self, original_name):
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
79 self.original_name = original_name
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
80
17
481a2b4be471 [svn] Lots of tests for various boundary cases, and slightly better handling for
brett
parents: 16
diff changeset
81 def is_free(self, filename):
58
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
diff changeset
82 try:
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
diff changeset
83 result = self.free_func(filename, *self.free_args)
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
diff changeset
84 except OSError, error:
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
diff changeset
85 if error.errno == errno.EEXIST:
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
diff changeset
86 return False
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
diff changeset
87 raise
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
diff changeset
88 if self.free_close:
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
diff changeset
89 self.free_close(result)
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
diff changeset
90 return True
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
91
41
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
92 def create(self):
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
93 fd, filename = tempfile.mkstemp(prefix=self.original_name + '.',
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
94 dir='.')
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
95 os.close(fd)
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
96 return filename
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
97
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
98 def check(self):
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
99 for suffix in [''] + ['.%s' % (x,) for x in range(1, 10)]:
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
100 filename = '%s%s' % (self.original_name, suffix)
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
101 if self.is_free(filename):
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
102 return filename
41
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
103 return self.create()
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
104
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
105
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
106 class DirectoryChecker(FilenameChecker):
58
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
diff changeset
107 free_func = os.mkdir
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
diff changeset
108 free_args = ()
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
diff changeset
109 free_close = None
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
110
41
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
111 def create(self):
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
112 return tempfile.mkdtemp(prefix=self.original_name + '.', dir='.')
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
113
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
114
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
115 class ExtractorError(Exception):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
116 pass
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
117
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
118
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
119 class ExtractorUnusable(Exception):
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
120 pass
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
121
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
122
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
123 EXTRACTION_ERRORS = (ExtractorError, ExtractorUnusable, OSError, IOError)
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
124
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
125 class BaseExtractor(object):
34
a8f875e02c83 [svn] Add support for LZMA compression. Holy crap that was easy.
brett
parents: 33
diff changeset
126 decoders = {'bzip2': 'bzcat', 'gzip': 'zcat', 'compress': 'zcat',
a8f875e02c83 [svn] Add support for LZMA compression. Holy crap that was easy.
brett
parents: 33
diff changeset
127 'lzma': 'lzcat'}
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
128
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
129 name_checker = DirectoryChecker
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
130
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
131 def __init__(self, filename, encoding):
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
132 if encoding and (not self.decoders.has_key(encoding)):
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
133 raise ValueError("unrecognized encoding %s" % (encoding,))
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
134 self.filename = os.path.realpath(filename)
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
135 self.encoding = encoding
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
136 self.included_archives = []
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
137 self.target = None
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
138 self.content_type = None
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
139 self.content_name = None
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
140 self.pipes = []
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
141 try:
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
142 self.archive = open(filename, 'r')
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
143 except (IOError, OSError), error:
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
144 raise ExtractorError("could not open %s: %s" %
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
145 (filename, error.strerror))
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
146 if encoding:
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
147 self.pipe([self.decoders[encoding]], "decoding")
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
148 self.prepare()
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
149
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
150 def pipe(self, command, description="extraction"):
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
151 self.pipes.append((command, description))
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
152
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
153 def run_pipes(self, final_stdout=None):
41
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
154 if not self.pipes:
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
155 return
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
156 elif final_stdout is None:
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
157 # FIXME: Buffering this might be dumb.
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
158 final_stdout = tempfile.TemporaryFile()
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
159 num_pipes = len(self.pipes)
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
160 last_pipe = num_pipes - 1
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
161 processes = []
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
162 for index, command in enumerate([pipe[0] for pipe in self.pipes]):
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
163 if index == 0:
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
164 stdin = self.archive
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
165 else:
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
166 stdin = processes[-1].stdout
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
167 if index == last_pipe:
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
168 stdout = final_stdout
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
169 else:
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
170 stdout = subprocess.PIPE
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
171 try:
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
172 processes.append(subprocess.Popen(command, stdin=stdin,
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
173 stdout=stdout,
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
174 stderr=subprocess.PIPE))
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
175 except OSError, error:
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
176 if error.errno == errno.ENOENT:
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
177 raise ExtractorUnusable("could not run %s" % (command[0],))
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
178 raise
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
179 exit_codes = [pipe.wait() for pipe in processes]
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
180 self.archive.close()
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
181 for index in range(last_pipe):
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
182 processes[index].stdout.close()
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
183 processes[index].stderr.close()
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
184 for index, status in enumerate(exit_codes):
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
185 if status != 0:
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
186 raise ExtractorError("%s error: '%s' returned status code %s" %
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
187 (self.pipes[index][1],
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
188 ' '.join(self.pipes[index][0]), status))
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
189 self.archive = final_stdout
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
190
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
191 def prepare(self):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
192 pass
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
193
47
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
194 def check_included_archives(self):
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
195 if (self.content_name is None) or (not self.content_name.endswith('/')):
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
196 self.included_root = './'
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
197 else:
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
198 self.included_root = self.content_name
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
199 start_index = len(self.included_root)
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
200 for path, dirname, filenames in os.walk(self.included_root):
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
201 path = path[start_index:]
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
202 for filename in filenames:
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
203 if (ExtractorBuilder.try_by_mimetype(filename) or
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
204 ExtractorBuilder.try_by_extension(filename)):
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
205 self.included_archives.append(os.path.join(path, filename))
22
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
206
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
207 def check_contents(self):
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
208 self.contents = os.listdir('.')
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
209 if not self.contents:
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
210 self.content_type = EMPTY
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
211 elif len(self.contents) == 1:
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
212 if self.basename() == self.contents[0]:
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
213 self.content_type = MATCHING_DIRECTORY
22
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
214 else:
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
215 self.content_type = ONE_ENTRY
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
216 self.content_name = self.contents[0]
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
217 if os.path.isdir(self.contents[0]):
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
218 self.content_name += '/'
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
219 else:
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
220 self.content_type = BOMB
47
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
221 self.check_included_archives()
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
222
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
223 def basename(self):
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
224 pieces = os.path.basename(self.filename).split('.')
2
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
225 extension = '.' + pieces[-1]
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
226 if mimetypes.encodings_map.has_key(extension):
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
227 pieces.pop()
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
228 extension = '.' + pieces[-1]
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
229 if (mimetypes.types_map.has_key(extension) or
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
230 mimetypes.common_types.has_key(extension) or
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
231 mimetypes.suffix_map.has_key(extension)):
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
232 pieces.pop()
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
233 return '.'.join(pieces)
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
234
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
235 def extract(self):
40
ee6a869f8da1 [svn] Be a little nicer about explaining that we can't extract to the current
brett
parents: 39
diff changeset
236 try:
ee6a869f8da1 [svn] Be a little nicer about explaining that we can't extract to the current
brett
parents: 39
diff changeset
237 self.target = tempfile.mkdtemp(prefix='.dtrx-', dir='.')
ee6a869f8da1 [svn] Be a little nicer about explaining that we can't extract to the current
brett
parents: 39
diff changeset
238 except (OSError, IOError), error:
ee6a869f8da1 [svn] Be a little nicer about explaining that we can't extract to the current
brett
parents: 39
diff changeset
239 raise ExtractorError("cannot extract here: %s" % (error.strerror,))
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
240 old_path = os.path.realpath(os.curdir)
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
241 os.chdir(self.target)
33
3547e3124729 [svn] Fix some bugs and make things a little more user-friendly now that we can
brett
parents: 32
diff changeset
242 try:
3547e3124729 [svn] Fix some bugs and make things a little more user-friendly now that we can
brett
parents: 32
diff changeset
243 self.archive.seek(0, 0)
3547e3124729 [svn] Fix some bugs and make things a little more user-friendly now that we can
brett
parents: 32
diff changeset
244 self.extract_archive()
3547e3124729 [svn] Fix some bugs and make things a little more user-friendly now that we can
brett
parents: 32
diff changeset
245 self.check_contents()
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
246 except EXTRACTION_ERRORS:
48
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
247 self.archive.close()
33
3547e3124729 [svn] Fix some bugs and make things a little more user-friendly now that we can
brett
parents: 32
diff changeset
248 os.chdir(old_path)
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
249 shutil.rmtree(self.target, ignore_errors=True)
33
3547e3124729 [svn] Fix some bugs and make things a little more user-friendly now that we can
brett
parents: 32
diff changeset
250 raise
48
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
251 self.archive.close()
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
252 os.chdir(old_path)
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
253
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
254 def get_filenames(self):
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
255 self.run_pipes()
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
256 self.archive.seek(0, 0)
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
257 while True:
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
258 line = self.archive.readline()
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
259 if not line:
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
260 self.archive.close()
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
261 return
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
262 yield line.rstrip('\n')
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
263
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
264
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
265 class CompressionExtractor(BaseExtractor):
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
266 file_type = 'compressed file'
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
267 name_checker = FilenameChecker
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
268
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
269 def basename(self):
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
270 pieces = os.path.basename(self.filename).split('.')
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
271 extension = '.' + pieces[-1]
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
272 if mimetypes.encodings_map.has_key(extension):
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
273 pieces.pop()
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
274 return '.'.join(pieces)
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
275
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
276 def get_filenames(self):
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
277 yield self.basename()
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
278
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
279 def extract(self):
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
280 self.content_type = ONE_ENTRY_KNOWN
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
281 self.content_name = self.basename()
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
282 self.contents = None
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
283 self.included_root = './'
40
ee6a869f8da1 [svn] Be a little nicer about explaining that we can't extract to the current
brett
parents: 39
diff changeset
284 try:
ee6a869f8da1 [svn] Be a little nicer about explaining that we can't extract to the current
brett
parents: 39
diff changeset
285 output_fd, self.target = tempfile.mkstemp(prefix='.dtrx-', dir='.')
ee6a869f8da1 [svn] Be a little nicer about explaining that we can't extract to the current
brett
parents: 39
diff changeset
286 except (OSError, IOError), error:
ee6a869f8da1 [svn] Be a little nicer about explaining that we can't extract to the current
brett
parents: 39
diff changeset
287 raise ExtractorError("cannot extract here: %s" % (error.strerror,))
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
288 try:
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
289 self.run_pipes(output_fd)
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
290 except EXTRACTION_ERRORS:
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
291 os.close(output_fd)
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
292 os.unlink(self.target)
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
293 raise
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
294 os.close(output_fd)
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
295
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
296
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
297 class TarExtractor(BaseExtractor):
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
298 file_type = 'tar file'
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
299
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
300 def get_filenames(self):
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
301 self.pipe(['tar', '-t'], "listing")
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
302 return BaseExtractor.get_filenames(self)
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
303
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
304 def extract_archive(self):
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
305 self.pipe(['tar', '-x'])
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
306 self.run_pipes()
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
307
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
308
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
309 class CpioExtractor(BaseExtractor):
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
310 file_type = 'cpio file'
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
311
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
312 def get_filenames(self):
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
313 self.pipe(['cpio', '-t'], "listing")
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
314 return BaseExtractor.get_filenames(self)
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
315
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
316 def extract_archive(self):
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
317 self.pipe(['cpio', '-i', '--make-directories',
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
318 '--no-absolute-filenames'])
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
319 self.run_pipes()
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
320
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
321
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
322 class RPMExtractor(CpioExtractor):
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
323 file_type = 'RPM'
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
324
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
325 def prepare(self):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
326 self.pipe(['rpm2cpio', '-'], "rpm2cpio")
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
327
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
328 def basename(self):
9
920417b8acc9 [svn] Fix issues with basename methods. First, string's rsplit method only
brett
parents: 8
diff changeset
329 pieces = os.path.basename(self.filename).split('.')
2
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
330 if len(pieces) == 1:
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
331 return pieces[0]
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
332 elif pieces[-1] != 'rpm':
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
333 return BaseExtractor.basename(self)
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
334 pieces.pop()
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
335 if len(pieces) == 1:
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
336 return pieces[0]
9
920417b8acc9 [svn] Fix issues with basename methods. First, string's rsplit method only
brett
parents: 8
diff changeset
337 elif len(pieces[-1]) < 8:
2
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
338 pieces.pop()
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
339 return '.'.join(pieces)
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
340
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
341 def check_contents(self):
47
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
342 self.check_included_archives()
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
343 self.content_type = BOMB
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
344
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
345
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
346 class DebExtractor(TarExtractor):
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
347 file_type = 'Debian package'
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
348
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
349 def prepare(self):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
350 self.pipe(['ar', 'p', self.filename, 'data.tar.gz'],
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
351 "data.tar.gz extraction")
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
352 self.pipe(['zcat'], "data.tar.gz decompression")
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
353
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
354 def basename(self):
9
920417b8acc9 [svn] Fix issues with basename methods. First, string's rsplit method only
brett
parents: 8
diff changeset
355 pieces = os.path.basename(self.filename).split('_')
2
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
356 if len(pieces) == 1:
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
357 return pieces[0]
9
920417b8acc9 [svn] Fix issues with basename methods. First, string's rsplit method only
brett
parents: 8
diff changeset
358 last_piece = pieces.pop()
920417b8acc9 [svn] Fix issues with basename methods. First, string's rsplit method only
brett
parents: 8
diff changeset
359 if (len(last_piece) > 10) or (not last_piece.endswith('.deb')):
2
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
360 return BaseExtractor.basename(self)
9
920417b8acc9 [svn] Fix issues with basename methods. First, string's rsplit method only
brett
parents: 8
diff changeset
361 return '_'.join(pieces)
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
362
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
363 def check_contents(self):
47
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
364 self.check_included_archives()
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
365 self.content_type = BOMB
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
366
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
367
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
368 class DebMetadataExtractor(DebExtractor):
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
369 def prepare(self):
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
370 self.pipe(['ar', 'p', self.filename, 'control.tar.gz'],
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
371 "control.tar.gz extraction")
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
372 self.pipe(['zcat'], "control.tar.gz decompression")
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
373
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
374
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
375 class GemExtractor(TarExtractor):
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
376 file_type = 'Ruby gem'
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
377
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
378 def prepare(self):
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
379 self.pipe(['tar', '-xO', 'data.tar.gz'], "data.tar.gz extraction")
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
380 self.pipe(['zcat'], "data.tar.gz decompression")
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
381
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
382 def check_contents(self):
47
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
383 self.check_included_archives()
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
384 self.content_type = BOMB
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
385
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
386
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
387 class GemMetadataExtractor(CompressionExtractor):
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
388 file_type = 'Ruby gem'
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
389
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
390 def prepare(self):
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
391 self.pipe(['tar', '-xO', 'metadata.gz'], "metadata.gz extraction")
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
392 self.pipe(['zcat'], "metadata.gz decompression")
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
393
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
394 def basename(self):
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
395 return os.path.basename(self.filename) + '-metadata.txt'
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
396
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
397
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
398 class NoPipeExtractor(BaseExtractor):
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
399 # Some extraction tools won't accept the archive from stdin. With
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
400 # these, the piping infrastructure we normally set up generally doesn't
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
401 # work, at least at first. We can still use most of it; we just don't
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
402 # want to seed self.archive with the archive file, since that sucks up
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
403 # memory. So instead we seed it with /dev/null, and specify the
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
404 # filename on the command line as necessary. We also open the actual
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
405 # file with os.open, to make sure we can actually do it (permissions
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
406 # are good, etc.). This class doesn't do anything by itself; it's just
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
407 # meant to be a base class for extractors that rely on these dumb
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
408 # tools.
32
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
409 def __init__(self, filename, encoding):
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
410 os.close(os.open(filename, os.O_RDONLY))
32
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
411 BaseExtractor.__init__(self, '/dev/null', None)
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
412 self.filename = os.path.realpath(filename)
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
413
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
414
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
415 class ZipExtractor(NoPipeExtractor):
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
416 file_type = 'Zip file'
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
417
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
418 def get_filenames(self):
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
419 self.pipe(['zipinfo', '-1', self.filename], "listing")
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
420 return BaseExtractor.get_filenames(self)
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
421
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
422 def extract_archive(self):
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
423 self.pipe(['unzip', '-q', self.filename])
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
424 self.run_pipes()
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
425
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
426
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
427 class SevenExtractor(NoPipeExtractor):
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
428 file_type = '7z file'
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
429 border_re = re.compile('^[- ]+$')
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
430
32
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
431 def get_filenames(self):
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
432 self.pipe(['7z', 'l', self.filename], "listing")
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
433 self.run_pipes()
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
434 self.archive.seek(0, 0)
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
435 fn_index = None
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
436 for line in self.archive:
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
437 if self.border_re.match(line):
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
438 if fn_index is not None:
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
439 break
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
440 else:
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
441 fn_index = line.rindex(' ') + 1
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
442 elif fn_index is not None:
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
443 yield line[fn_index:-1]
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
444 self.archive.close()
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
445
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
446 def extract_archive(self):
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
447 self.pipe(['7z', 'x', self.filename])
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
448 self.run_pipes()
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
449
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
450
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
451 class CABExtractor(NoPipeExtractor):
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
452 file_type = 'CAB archive'
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
453 border_re = re.compile(r'^[-\+]+$')
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
454
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
455 def get_filenames(self):
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
456 self.pipe(['cabextract', '-l', self.filename], "listing")
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
457 self.run_pipes()
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
458 self.archive.seek(0, 0)
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
459 fn_index = None
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
460 for line in self.archive:
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
461 if self.border_re.match(line):
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
462 break
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
463 for line in self.archive:
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
464 try:
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
465 yield line.split(' | ', 2)[2].rstrip('\n')
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
466 except IndexError:
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
467 break
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
468 self.archive.close()
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
469
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
470 def extract_archive(self):
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
471 self.pipe(['cabextract', '-q', self.filename])
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
472 self.run_pipes()
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
473
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
474
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
475 class BaseHandler(object):
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
476 def __init__(self, extractor, options):
8
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
477 self.extractor = extractor
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
478 self.options = options
17
481a2b4be471 [svn] Lots of tests for various boundary cases, and slightly better handling for
brett
parents: 16
diff changeset
479 self.target = None
16
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
480
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
481 def handle(self):
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
482 command = 'find'
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
483 status = subprocess.call(['find', self.extractor.target, '-type', 'd',
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
484 '-exec', 'chmod', 'u+rwx', '{}', ';'])
8
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
485 if status == 0:
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
486 command = 'chmod'
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
487 status = subprocess.call(['chmod', '-R', 'u+rwX',
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
488 self.extractor.target])
8
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
489 if status != 0:
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
490 return "%s returned with exit status %s" % (command, status)
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
491 return self.organize()
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
492
41
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
493 def set_target(self, target, checker):
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
494 self.target = checker(target).check()
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
495 if self.target != target:
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
496 logger.warning("extracting %s to %s" %
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
497 (self.extractor.filename, self.target))
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
498
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
499
16
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
500 # The "where to extract" table, with options and archive types.
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
501 # This dictates the contents of each can_handle method.
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
502 #
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
503 # Flat Overwrite None
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
504 # File basename basename FilenameChecked
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
505 # Match . . tempdir + checked
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
506 # Bomb . basename DirectoryChecked
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
507
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
508 class FlatHandler(BaseHandler):
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
509 def can_handle(contents, options):
22
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
510 return ((options.flat and (contents != ONE_ENTRY_KNOWN)) or
16
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
511 (options.overwrite and (contents == MATCHING_DIRECTORY)))
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
512 can_handle = staticmethod(can_handle)
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
513
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
514 def organize(self):
17
481a2b4be471 [svn] Lots of tests for various boundary cases, and slightly better handling for
brett
parents: 16
diff changeset
515 self.target = '.'
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
516 for curdir, dirs, filenames in os.walk(self.extractor.target,
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
517 topdown=False):
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
518 path_parts = curdir.split(os.sep)
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
519 if path_parts[0] == '.':
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
520 del path_parts[1]
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
521 else:
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
522 del path_parts[0]
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
523 newdir = os.path.join(*path_parts)
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
524 if not os.path.isdir(newdir):
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
525 os.makedirs(newdir)
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
526 for filename in filenames:
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
527 os.rename(os.path.join(curdir, filename),
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
528 os.path.join(newdir, filename))
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
529 os.rmdir(curdir)
16
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
530
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
531
16
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
532 class OverwriteHandler(BaseHandler):
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
533 def can_handle(contents, options):
22
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
534 return ((options.flat and (contents == ONE_ENTRY_KNOWN)) or
16
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
535 (options.overwrite and (contents != MATCHING_DIRECTORY)))
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
536 can_handle = staticmethod(can_handle)
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
537
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
538 def organize(self):
16
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
539 self.target = self.extractor.basename()
51
f1789e6586d8 [svn] Don't try to rmtree when overwriting just a file.
brett
parents: 49
diff changeset
540 if os.path.isdir(self.target):
f1789e6586d8 [svn] Don't try to rmtree when overwriting just a file.
brett
parents: 49
diff changeset
541 shutil.rmtree(self.target)
48
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
542 os.rename(self.extractor.target, self.target)
16
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
543
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
544
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
545 class MatchHandler(BaseHandler):
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
546 def can_handle(contents, options):
20
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
547 return ((contents == MATCHING_DIRECTORY) or
22
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
548 ((contents == ONE_ENTRY) and
25
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
549 options.one_entry_policy.ok_for_match()))
16
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
550 can_handle = staticmethod(can_handle)
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
551
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
552 def organize(self):
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
553 source = os.path.join(self.extractor.target,
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
554 os.listdir(self.extractor.target)[0])
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
555 if os.path.isdir(source):
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
556 checker = DirectoryChecker
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
557 else:
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
558 checker = FilenameChecker
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
559 if self.options.one_entry_policy == EXTRACT_HERE:
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
560 destination = self.extractor.content_name.rstrip('/')
20
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
561 else:
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
562 destination = self.extractor.basename()
41
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
563 self.set_target(destination, checker)
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
564 if os.path.isdir(self.extractor.target):
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
565 os.rename(source, self.target)
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
566 os.rmdir(self.extractor.target)
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
567 else:
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
568 os.rename(self.extractor.target, self.target)
53
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
569 self.extractor.included_root = './'
8
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
570
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
571
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
572 class EmptyHandler(object):
16
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
573 def can_handle(contents, options):
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
574 return contents == EMPTY
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
575 can_handle = staticmethod(can_handle)
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
576
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
577 def __init__(self, extractor, options): pass
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
578 def handle(self): pass
8
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
579
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
580
16
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
581 class BombHandler(BaseHandler):
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
582 def can_handle(contents, options):
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
583 return True
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
584 can_handle = staticmethod(can_handle)
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
585
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
586 def organize(self):
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
587 basename = self.extractor.basename()
41
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
588 self.set_target(basename, self.extractor.name_checker)
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
589 os.rename(self.extractor.target, self.target)
16
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
590
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
591
25
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
592 class BasePolicy(object):
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
593 def __init__(self, options):
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
594 self.current_policy = None
26
d660410455d9 [svn] Little DRY cleanups.
brett
parents: 25
diff changeset
595 if options.batch:
d660410455d9 [svn] Little DRY cleanups.
brett
parents: 25
diff changeset
596 self.permanent_policy = self.answers['']
d660410455d9 [svn] Little DRY cleanups.
brett
parents: 25
diff changeset
597 else:
d660410455d9 [svn] Little DRY cleanups.
brett
parents: 25
diff changeset
598 self.permanent_policy = None
25
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
599
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
600 def ask_question(self, question):
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
601 question = textwrap.wrap(question) + self.choices
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
602 while True:
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
603 print "\n".join(question)
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
604 try:
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
605 answer = raw_input(self.prompt)
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
606 except EOFError:
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
607 return self.answers['']
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
608 try:
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
609 return self.answers[answer.lower()]
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
610 except KeyError:
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
611 print
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
612
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
613 def __cmp__(self, other):
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
614 return cmp(self.current_policy, other)
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
615
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
616
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
617 class OneEntryPolicy(BasePolicy):
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
618 answers = {'h': EXTRACT_HERE, 'i': EXTRACT_WRAP, 'r': EXTRACT_RENAME,
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
619 '': EXTRACT_WRAP}
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
620 choices = ["You can:",
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
621 " * extract it Inside another directory",
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
622 " * extract it and Rename the directory",
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
623 " * extract it Here"]
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
624 prompt = "What do you want to do? (I/r/h) "
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
625
26
d660410455d9 [svn] Little DRY cleanups.
brett
parents: 25
diff changeset
626 def prep(self, archive_filename, entry_name):
d660410455d9 [svn] Little DRY cleanups.
brett
parents: 25
diff changeset
627 question = ("%s contains one entry: %s." %
d660410455d9 [svn] Little DRY cleanups.
brett
parents: 25
diff changeset
628 (archive_filename, entry_name))
d660410455d9 [svn] Little DRY cleanups.
brett
parents: 25
diff changeset
629 self.current_policy = (self.permanent_policy or
d660410455d9 [svn] Little DRY cleanups.
brett
parents: 25
diff changeset
630 self.ask_question(question))
25
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
631
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
632 def ok_for_match(self):
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
633 return self.current_policy in (EXTRACT_RENAME, EXTRACT_HERE)
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
634
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
635
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
636 class RecursionPolicy(BasePolicy):
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
637 answers = {'o': RECURSE_ONCE, 'a': RECURSE_ALWAYS, 'n': RECURSE_NOT_NOW,
53
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
638 'v': RECURSE_NEVER, 'l': RECURSE_LIST, '': RECURSE_NOT_NOW}
25
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
639 choices = ["You can:",
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
640 " * Always extract included archives",
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
641 " * extract included archives this Once",
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
642 " * choose Not to extract included archives",
53
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
643 " * neVer extract included archives",
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
644 " * List included archives"]
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
645 prompt = "What do you want to do? (a/o/N/v/l) "
25
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
646
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
647 def __init__(self, options):
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
648 BasePolicy.__init__(self, options)
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
649 if options.show_list:
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
650 self.permanent_policy = RECURSE_NEVER
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
651 elif options.recursive:
25
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
652 self.permanent_policy = RECURSE_ALWAYS
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
653
53
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
654 def prep(self, current_filename, target, extractor):
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
655 archive_count = len(extractor.included_archives)
25
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
656 if (self.permanent_policy is not None) or (archive_count == 0):
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
657 self.current_policy = self.permanent_policy or RECURSE_NOT_NOW
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
658 return
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
659 elif archive_count > 1:
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
660 question = ("%s contains %s other archive files." %
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
661 (current_filename, archive_count))
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
662 else:
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
663 question = ("%s contains another archive: %s." %
53
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
664 (current_filename, extractor.included_archives[0]))
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
665 if target == '.':
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
666 target = ''
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
667 included_root = extractor.included_root
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
668 if included_root == './':
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
669 included_root = ''
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
670 while True:
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
671 self.current_policy = self.ask_question(question)
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
672 if self.current_policy != RECURSE_LIST:
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
673 break
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
674 print ("\n%s\n" %
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
675 '\n'.join([os.path.join(target, included_root, filename)
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
676 for filename in extractor.included_archives]))
25
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
677 if self.current_policy in (RECURSE_ALWAYS, RECURSE_NEVER):
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
678 self.permanent_policy = self.current_policy
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
679
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
680 def ok_to_recurse(self):
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
681 return self.current_policy in (RECURSE_ALWAYS, RECURSE_ONCE)
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
682
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
683
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
684 class ExtractorBuilder(object):
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
685 extractor_map = {'tar': (TarExtractor, None),
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
686 'zip': (ZipExtractor, None),
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
687 'deb': (DebExtractor, DebMetadataExtractor),
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
688 'rpm': (RPMExtractor, None),
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
689 'cpio': (CpioExtractor, None),
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
690 'gem': (GemExtractor, GemMetadataExtractor),
32
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
691 'compress': (CompressionExtractor, None),
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
692 '7z': (SevenExtractor, None),
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
693 'cab': (CABExtractor, None)}
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
694
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
695 mimetype_map = {}
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
696 for mapping in (('tar', 'x-tar'),
59
7a0aafe2fe87 [svn] Find self-extracting archives by their file magic only, not extension/mimetype.
brett
parents: 58
diff changeset
697 ('zip', 'zip'),
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
698 ('deb', 'x-debian-package'),
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
699 ('rpm', 'x-redhat-package-manager', 'x-rpm'),
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
700 ('cpio', 'x-cpio'),
32
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
701 ('gem', 'x-ruby-gem'),
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
702 ('7z', 'x-7z-compressed'),
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
703 ('cab', 'x-cab')):
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
704 for mimetype in mapping[1:]:
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
705 if '/' not in mimetype:
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
706 mimetype = 'application/' + mimetype
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
707 mimetype_map[mimetype] = mapping[0]
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
708
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
709 magic_mime_map = {}
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
710 for mapping in (('deb', 'Debian binary package'),
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
711 ('cpio', 'cpio archive'),
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
712 ('tar', 'POSIX tar archive'),
59
7a0aafe2fe87 [svn] Find self-extracting archives by their file magic only, not extension/mimetype.
brett
parents: 58
diff changeset
713 ('zip', '(Zip|ZIP self-extracting) archive'),
32
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
714 ('rpm', 'RPM'),
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
715 ('7z', '7-zip archive'),
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
716 ('cab', 'Microsoft Cabinet archive')):
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
717 for pattern in mapping[1:]:
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
718 magic_mime_map[re.compile(pattern)] = mapping[0]
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
719
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
720 magic_encoding_map = {}
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
721 for mapping in (('bzip2', 'bzip2 compressed'),
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
722 ('gzip', 'gzip compressed')):
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
723 for pattern in mapping[1:]:
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
724 magic_encoding_map[re.compile(pattern)] = mapping[0]
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
725
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
726 extension_map = {}
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
727 for mapping in (('tar', 'bzip2', 'tar.bz2'),
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
728 ('tar', 'gzip', 'tar.gz', 'tgz'),
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
729 ('tar', None, 'tar'),
59
7a0aafe2fe87 [svn] Find self-extracting archives by their file magic only, not extension/mimetype.
brett
parents: 58
diff changeset
730 ('zip', None, 'zip'),
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
731 ('deb', None, 'deb'),
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
732 ('rpm', None, 'rpm'),
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
733 ('cpio', None, 'cpio'),
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
734 ('gem', None, 'gem'),
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
735 ('compress', 'gzip', 'Z', 'gz'),
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
736 ('compress', 'bzip2', 'bz2'),
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
737 ('compress', 'lzma', 'lzma'),
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
738 ('7z', None, '7z'),
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
739 ('cab', None, 'cab', 'exe')):
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
740 for extension in mapping[2:]:
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
741 extension_map.setdefault(extension, []).append(mapping[:2])
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
742
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
743 def __init__(self, filename, options):
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
744 self.filename = filename
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
745 self.options = options
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
746
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
747 def build_extractor(self, archive_type, encoding):
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
748 extractors = self.extractor_map[archive_type]
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
749 if self.options.metadata and (extractors[1] is not None):
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
750 extractor = extractors[1]
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
751 else:
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
752 extractor = extractors[0]
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
753 return extractor(self.filename, encoding)
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
754
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
755 def get_extractor(self):
48
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
756 tried_types = set()
36
4bf2508d9b9e [svn] Small optimization to be nice to the system: don't try a given extractor
brett
parents: 35
diff changeset
757 # As smart as it is, the magic test can't go first, because at least
4bf2508d9b9e [svn] Small optimization to be nice to the system: don't try a given extractor
brett
parents: 35
diff changeset
758 # on my system it just recognizes gem files as tar files. I guess
4bf2508d9b9e [svn] Small optimization to be nice to the system: don't try a given extractor
brett
parents: 35
diff changeset
759 # it's possible for the opposite problem to occur -- where the mimetype
4bf2508d9b9e [svn] Small optimization to be nice to the system: don't try a given extractor
brett
parents: 35
diff changeset
760 # or extension suggests something less than ideal -- but it seems less
4bf2508d9b9e [svn] Small optimization to be nice to the system: don't try a given extractor
brett
parents: 35
diff changeset
761 # likely so I'm sticking with this.
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
762 for func_name in ('mimetype', 'extension', 'magic'):
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
763 logger.debug("getting extractors by %s" % (func_name,))
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
764 extractor_types = \
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
765 getattr(self, 'try_by_' + func_name)(self.filename)
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
766 logger.debug("done getting extractors")
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
767 for ext_args in extractor_types:
36
4bf2508d9b9e [svn] Small optimization to be nice to the system: don't try a given extractor
brett
parents: 35
diff changeset
768 if ext_args in tried_types:
4bf2508d9b9e [svn] Small optimization to be nice to the system: don't try a given extractor
brett
parents: 35
diff changeset
769 continue
4bf2508d9b9e [svn] Small optimization to be nice to the system: don't try a given extractor
brett
parents: 35
diff changeset
770 tried_types.add(ext_args)
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
771 logger.debug("trying %s extractor from %s" %
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
772 (ext_args, func_name))
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
773 yield self.build_extractor(*ext_args)
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
774
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
775 def try_by_mimetype(cls, filename):
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
776 mimetype, encoding = mimetypes.guess_type(filename)
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
777 try:
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
778 return [(cls.mimetype_map[mimetype], encoding)]
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
779 except KeyError:
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
780 if encoding:
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
781 return [('compress', encoding)]
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
782 return []
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
783 try_by_mimetype = classmethod(try_by_mimetype)
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
784
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
785 def magic_map_matches(cls, output, magic_map):
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
786 return [result for regexp, result in magic_map.items()
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
787 if regexp.search(output)]
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
788 magic_map_matches = classmethod(magic_map_matches)
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
789
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
790 def try_by_magic(cls, filename):
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
791 process = subprocess.Popen(['file', '-z', filename],
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
792 stdout=subprocess.PIPE)
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
793 status = process.wait()
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
794 if status != 0:
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
795 return []
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
796 output = process.stdout.readline()
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
797 process.stdout.close()
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
798 if output.startswith('%s: ' % filename):
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
799 output = output[len(filename) + 2:]
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
800 mimes = cls.magic_map_matches(output, cls.magic_mime_map)
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
801 encodings = cls.magic_map_matches(output, cls.magic_encoding_map)
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
802 if mimes and not encodings:
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
803 encodings = [None]
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
804 elif encodings and not mimes:
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
805 mimes = ['compress']
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
806 return [(m, e) for m in mimes for e in encodings]
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
807 try_by_magic = classmethod(try_by_magic)
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
808
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
809 def try_by_extension(cls, filename):
43
4591a32eedc8 [svn] Sadly Python 2.3 does not have an rsplit method on strings.
brett
parents: 42
diff changeset
810 parts = filename.split('.')[-2:]
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
811 results = []
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
812 while parts:
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
813 results.extend(cls.extension_map.get('.'.join(parts), []))
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
814 del parts[0]
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
815 return results
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
816 try_by_extension = classmethod(try_by_extension)
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
817
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
818
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
819 class BaseAction(object):
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
820 def __init__(self, options, filenames):
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
821 self.options = options
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
822 self.filenames = filenames
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
823 self.target = None
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
824
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
825 def report(self, function, *args):
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
826 try:
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
827 error = function(*args)
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
828 except EXTRACTION_ERRORS, exception:
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
829 error = str(exception)
33
3547e3124729 [svn] Fix some bugs and make things a little more user-friendly now that we can
brett
parents: 32
diff changeset
830 logger.debug(''.join(traceback.format_exception(*sys.exc_info())))
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
831 return error
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
832
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
833
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
834 class ExtractionAction(BaseAction):
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
835 handlers = [FlatHandler, OverwriteHandler, MatchHandler, EmptyHandler,
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
836 BombHandler]
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
837
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
838 def __init__(self, options, filenames):
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
839 BaseAction.__init__(self, options, filenames)
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
840 self.did_print = False
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
841
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
842 def get_handler(self, extractor):
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
843 if extractor.content_type == ONE_ENTRY:
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
844 self.options.one_entry_policy.prep(self.current_filename,
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
845 extractor.content_name)
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
846 for handler in self.handlers:
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
847 if handler.can_handle(extractor.content_type, self.options):
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
848 logger.debug("using %s handler" % (handler.__name__,))
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
849 self.current_handler = handler(extractor, self.options)
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
850 break
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
851
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
852 def show_extraction(self, extractor):
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
853 if self.options.log_level > logging.INFO:
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
854 return
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
855 elif self.did_print:
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
856 print
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
857 else:
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
858 self.did_print = True
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
859 print "%s:" % (self.current_filename,)
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
860 if extractor.contents is None:
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
861 print self.current_handler.target
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
862 return
55
494516c027c4 [svn] Stupid Python 2.3 doesn't support [].sort(reverse=True).
brett
parents: 54
diff changeset
863 def reverser(x, y):
494516c027c4 [svn] Stupid Python 2.3 doesn't support [].sort(reverse=True).
brett
parents: 54
diff changeset
864 return cmp(y, x)
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
865 if self.current_handler.target == '.':
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
866 filenames = extractor.contents
55
494516c027c4 [svn] Stupid Python 2.3 doesn't support [].sort(reverse=True).
brett
parents: 54
diff changeset
867 filenames.sort(reverser)
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
868 else:
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
869 filenames = [self.current_handler.target]
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
870 pathjoin = os.path.join
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
871 isdir = os.path.isdir
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
872 while filenames:
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
873 filename = filenames.pop()
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
874 if isdir(filename):
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
875 print "%s/" % (filename,)
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
876 new_filenames = os.listdir(filename)
55
494516c027c4 [svn] Stupid Python 2.3 doesn't support [].sort(reverse=True).
brett
parents: 54
diff changeset
877 new_filenames.sort(reverser)
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
878 filenames.extend([pathjoin(filename, new_filename)
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
879 for new_filename in new_filenames])
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
880 else:
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
881 print filename
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
882
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
883 def run(self, filename, extractor):
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
884 self.current_filename = filename
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
885 error = (self.report(extractor.extract) or
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
886 self.report(self.get_handler, extractor) or
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
887 self.report(self.current_handler.handle) or
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
888 self.report(self.show_extraction, extractor))
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
889 if not error:
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
890 self.target = self.current_handler.target
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
891 return error
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
892
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
893
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
894 class ListAction(BaseAction):
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
895 def __init__(self, options, filenames):
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
896 BaseAction.__init__(self, options, filenames)
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
897 self.count = 0
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
898
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
899 def get_list(self, extractor):
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
900 # Note: The reason I'm getting all the filenames up front is
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
901 # because if we run into trouble partway through the archive, we'll
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
902 # try another extractor. So before we display anything we have to
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
903 # be sure this one is successful. We maybe don't have to be quite
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
904 # this conservative but this is the easy way out for now.
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
905 self.filelist = list(extractor.get_filenames())
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
906
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
907 def show_list(self, filename):
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
908 self.count += 1
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
909 if len(self.filenames) != 1:
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
910 if self.count > 1:
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
911 print
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
912 print "%s:" % (filename,)
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
913 print '\n'.join(self.filelist)
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
914
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
915 def run(self, filename, extractor):
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
916 return (self.report(self.get_list, extractor) or
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
917 self.report(self.show_list, filename))
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
918
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
919
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
920 class ExtractorApplication(object):
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
921 def __init__(self, arguments):
48
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
922 for signal_num in (signal.SIGINT, signal.SIGTERM):
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
923 signal.signal(signal_num, self.abort)
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
924 self.parse_options(arguments)
12
5d202467c589 [svn] Introduce a real logging system. Right now all this really gets us is the
brett
parents: 11
diff changeset
925 self.setup_logger()
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
926 self.successes = []
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
927 self.failures = []
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
928
48
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
929 def abort(self, signal_num, frame):
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
930 signal.signal(signal_num, signal.SIG_IGN)
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
931 print
49
c76dd2716113 [svn] Add a traceback when we catch a signal.
brett
parents: 48
diff changeset
932 logger.debug("traceback:\n" +
c76dd2716113 [svn] Add a traceback when we catch a signal.
brett
parents: 48
diff changeset
933 ''.join(traceback.format_stack(frame)).rstrip())
48
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
934 logger.debug("got signal %s; cleaning up" % (signal_num,))
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
935 clean_targets = set([os.path.realpath('.')])
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
936 if hasattr(self, 'current_directory'):
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
937 clean_targets.add(os.path.realpath(self.current_directory))
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
938 for directory in clean_targets:
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
939 os.chdir(directory)
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
940 for path in glob.glob('.dtrx-*'):
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
941 try:
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
942 os.unlink(path)
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
943 except OSError, error:
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
944 if error.errno == errno.EISDIR:
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
945 shutil.rmtree(path, ignore_errors=True)
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
946 sys.exit(1)
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
947
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
948 def parse_options(self, arguments):
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
949 parser = optparse.OptionParser(
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
950 usage="%prog [options] archive [archive2 ...]",
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
951 description="Intelligent archive extractor",
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
952 version=VERSION_BANNER
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
953 )
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
954 parser.add_option('-r', '--recursive', dest='recursive',
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
955 action='store_true', default=False,
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
956 help='extract archives contained in the ones listed')
13
0a3ef1b9f6d4 [svn] Add options to tweak the logging level to taste.
brett
parents: 12
diff changeset
957 parser.add_option('-q', '--quiet', dest='quiet',
0a3ef1b9f6d4 [svn] Add options to tweak the logging level to taste.
brett
parents: 12
diff changeset
958 action='count', default=3,
0a3ef1b9f6d4 [svn] Add options to tweak the logging level to taste.
brett
parents: 12
diff changeset
959 help='suppress warning/error messages')
0a3ef1b9f6d4 [svn] Add options to tweak the logging level to taste.
brett
parents: 12
diff changeset
960 parser.add_option('-v', '--verbose', dest='verbose',
0a3ef1b9f6d4 [svn] Add options to tweak the logging level to taste.
brett
parents: 12
diff changeset
961 action='count', default=0,
0a3ef1b9f6d4 [svn] Add options to tweak the logging level to taste.
brett
parents: 12
diff changeset
962 help='be verbose/print debugging information')
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
963 parser.add_option('-o', '--overwrite', dest='overwrite',
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
964 action='store_true', default=False,
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
965 help='overwrite any existing target directory')
15
28dbd52a8bb8 [svn] Add a -f/--flat option, which will extract the archive contents into the
brett
parents: 14
diff changeset
966 parser.add_option('-f', '--flat', '--no-directory', dest='flat',
28dbd52a8bb8 [svn] Add a -f/--flat option, which will extract the archive contents into the
brett
parents: 14
diff changeset
967 action='store_true', default=False,
28dbd52a8bb8 [svn] Add a -f/--flat option, which will extract the archive contents into the
brett
parents: 14
diff changeset
968 help="don't put contents in their own directory")
19
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
969 parser.add_option('-l', '-t', '--list', '--table', dest='show_list',
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
970 action='store_true', default=False,
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
971 help="list contents of archives on standard output")
20
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
972 parser.add_option('-n', '--noninteractive', dest='batch',
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
973 action='store_true', default=False,
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
974 help="don't ask how to handle special cases")
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
975 parser.add_option('-m', '--metadata', dest='metadata',
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
976 action='store_true', default=False,
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
977 help="extract metadata from a .deb/.gem")
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
978 self.options, filenames = parser.parse_args(arguments)
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
979 if not filenames:
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
980 parser.error("you did not list any archives")
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
981 # This makes WARNING is the default.
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
982 self.options.log_level = (10 * (self.options.quiet -
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
983 self.options.verbose))
25
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
984 self.options.one_entry_policy = OneEntryPolicy(self.options)
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
985 self.options.recursion_policy = RecursionPolicy(self.options)
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
986 self.archives = {os.path.realpath(os.curdir): filenames}
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
987
12
5d202467c589 [svn] Introduce a real logging system. Right now all this really gets us is the
brett
parents: 11
diff changeset
988 def setup_logger(self):
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
989 logging.getLogger().setLevel(self.options.log_level)
12
5d202467c589 [svn] Introduce a real logging system. Right now all this really gets us is the
brett
parents: 11
diff changeset
990 handler = logging.StreamHandler()
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
991 handler.setLevel(self.options.log_level)
19
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
992 formatter = logging.Formatter("dtrx: %(levelname)s: %(message)s")
12
5d202467c589 [svn] Introduce a real logging system. Right now all this really gets us is the
brett
parents: 11
diff changeset
993 handler.setFormatter(formatter)
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
994 logger.addHandler(handler)
33
3547e3124729 [svn] Fix some bugs and make things a little more user-friendly now that we can
brett
parents: 32
diff changeset
995 logger.debug("logger is set up")
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
996
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
997 def recurse(self, filename, extractor, action):
53
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
998 self.options.recursion_policy.prep(filename, action.target, extractor)
25
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
999 if self.options.recursion_policy.ok_to_recurse():
53
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
1000 for filename in extractor.included_archives:
25
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
1001 tail_path, basename = os.path.split(filename)
47
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
1002 directory = os.path.join(self.current_directory, action.target,
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
1003 extractor.included_root, tail_path)
25
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
1004 self.archives.setdefault(directory, []).append(basename)
8
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
1005
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1006 def check_file(self, filename):
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1007 try:
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1008 result = os.stat(filename)
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1009 except OSError, error:
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1010 return error.strerror
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1011 if stat.S_ISDIR(result.st_mode):
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1012 return "cannot extract a directory"
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1013
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1014 def try_extractors(self, filename, builder):
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1015 errors = []
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1016 for extractor in builder:
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1017 error = self.action.run(filename, extractor)
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1018 if error:
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1019 errors.append((extractor.file_type, extractor.encoding, error))
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1020 else:
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1021 self.recurse(filename, extractor, self.action)
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1022 return
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1023 logger.error("could not handle %s" % (filename,))
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1024 if not errors:
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1025 logger.error("not a known archive type")
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1026 return True
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1027 for file_type, encoding, error in errors:
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1028 message = ["treating as", file_type, "failed:", error]
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1029 if encoding:
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1030 message.insert(1, "%s-encoded" % (encoding,))
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1031 logger.error(' '.join(message))
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1032 return True
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1033
19
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
1034 def run(self):
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
1035 if self.options.show_list:
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
1036 action = ListAction
19
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
1037 else:
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
1038 action = ExtractionAction
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1039 self.action = action(self.options, self.archives.values()[0])
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
1040 while self.archives:
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
1041 self.current_directory, self.filenames = self.archives.popitem()
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
1042 os.chdir(self.current_directory)
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
1043 for filename in self.filenames:
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
1044 builder = ExtractorBuilder(filename, self.options)
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1045 error = (self.check_file(filename) or
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1046 self.try_extractors(filename, builder.get_extractor()))
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1047 if error:
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1048 if error != True:
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1049 logger.error("%s: %s" % (filename, error))
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1050 self.failures.append(filename)
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
1051 else:
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1052 self.successes.append(filename)
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
1053 self.options.one_entry_policy.permanent_policy = EXTRACT_WRAP
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
1054 if self.failures:
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
1055 return 1
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
1056 return 0
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
1057
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
1058
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
1059 if __name__ == '__main__':
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
1060 app = ExtractorApplication(sys.argv[1:])
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
1061 sys.exit(app.run())

mercurial