scripts/dtrx

Thu, 24 Jul 2008 22:35:38 -0400

author
Brett Smith <brettcsmith@brettcsmith.org>
date
Thu, 24 Jul 2008 22:35:38 -0400
branch
trunk
changeset 85
ad73f75c9046
parent 84
d78d63cb4c4e
child 86
e02ca4e9bf42
permissions
-rwxr-xr-x

Use the default action for SIGPIPE.

Using Python's built-in handler makes weird error messages for the user,
and there's no reason not to act like every other Unix program in this
regard. This seems particularly true since we're most likely to get
SIGPIPE with dtrx -l, where we won't write things to disk etc.

1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
1 #!/usr/bin/env python
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
2 #
19
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
3 # dtrx -- Intelligently extract various archive types.
54
cd43d2f61162 [svn] Update copyright dates in the license headers.
brett
parents: 53
diff changeset
4 # Copyright (c) 2006, 2007, 2008 Brett Smith <brettcsmith@brettcsmith.org>.
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
5 #
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
6 # This program is free software; you can redistribute it and/or modify it
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
7 # under the terms of the GNU General Public License as published by the
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
8 # Free Software Foundation; either version 3 of the License, or (at your
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
9 # option) any later version.
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
10 #
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
11 # This program is distributed in the hope that it will be useful, but
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
12 # WITHOUT ANY WARRANTY; without even the implied warranty of
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
13 # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
14 # Public License for more details.
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
15 #
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
16 # You should have received a copy of the GNU General Public License along
42
4a4cab75d5e6 [svn] Update documentation.
brett
parents: 41
diff changeset
17 # with this program; if not, see <http://www.gnu.org/licenses/>.
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
18
72
c4cfaf634bb9 Add support for InstallShield archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 71
diff changeset
19 # Python 2.3 string methods: 'rfind', 'rindex', 'rjust', 'rstrip'
c4cfaf634bb9 Add support for InstallShield archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 71
diff changeset
20
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
21 import errno
48
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
22 import glob
12
5d202467c589 [svn] Introduce a real logging system. Right now all this really gets us is the
brett
parents: 11
diff changeset
23 import logging
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
24 import mimetypes
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
25 import optparse
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
26 import os
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
27 import re
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
28 import shutil
48
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
29 import signal
15
28dbd52a8bb8 [svn] Add a -f/--flat option, which will extract the archive contents into the
brett
parents: 14
diff changeset
30 import stat
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
31 import subprocess
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
32 import sys
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
33 import tempfile
20
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
34 import textwrap
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
35 import traceback
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
36
48
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
37 from sets import Set as set
36
4bf2508d9b9e [svn] Small optimization to be nice to the system: don't try a given extractor
brett
parents: 35
diff changeset
38
74
dd577317bccb Updates for 6.1 release.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 73
diff changeset
39 VERSION = "6.1"
19
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
40 VERSION_BANNER = """dtrx version %s
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
41 Copyright (c) 2006, 2007, 2008 Brett Smith <brettcsmith@brettcsmith.org>
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
42
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
43 This program is free software; you can redistribute it and/or modify it
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
44 under the terms of the GNU General Public License as published by the
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
45 Free Software Foundation; either version 3 of the License, or (at your
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
46 option) any later version.
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
47
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
48 This program is distributed in the hope that it will be useful, but
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
49 WITHOUT ANY WARRANTY; without even the implied warranty of
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
50 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
51 Public License for more details.""" % (VERSION,)
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
52
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
53 MATCHING_DIRECTORY = 1
65
0aea49161478 Make the wording on the One Entry question a little clearer.
Brett Smith <brett@brettcsmith.org>
parents: 62
diff changeset
54 ONE_ENTRY_KNOWN = 2
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
55 BOMB = 3
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
56 EMPTY = 4
65
0aea49161478 Make the wording on the One Entry question a little clearer.
Brett Smith <brett@brettcsmith.org>
parents: 62
diff changeset
57 ONE_ENTRY_FILE = 'file'
0aea49161478 Make the wording on the One Entry question a little clearer.
Brett Smith <brett@brettcsmith.org>
parents: 62
diff changeset
58 ONE_ENTRY_DIRECTORY = 'directory'
0aea49161478 Make the wording on the One Entry question a little clearer.
Brett Smith <brett@brettcsmith.org>
parents: 62
diff changeset
59
0aea49161478 Make the wording on the One Entry question a little clearer.
Brett Smith <brett@brettcsmith.org>
parents: 62
diff changeset
60 ONE_ENTRY_UNKNOWN = [ONE_ENTRY_FILE, ONE_ENTRY_DIRECTORY]
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
61
20
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
62 EXTRACT_HERE = 1
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
63 EXTRACT_WRAP = 2
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
64 EXTRACT_RENAME = 3
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
65
23
039dd321a7d0 [svn] If an archive contains other archives, and the user didn't specify that
brett
parents: 22
diff changeset
66 RECURSE_ALWAYS = 1
039dd321a7d0 [svn] If an archive contains other archives, and the user didn't specify that
brett
parents: 22
diff changeset
67 RECURSE_ONCE = 2
039dd321a7d0 [svn] If an archive contains other archives, and the user didn't specify that
brett
parents: 22
diff changeset
68 RECURSE_NOT_NOW = 3
039dd321a7d0 [svn] If an archive contains other archives, and the user didn't specify that
brett
parents: 22
diff changeset
69 RECURSE_NEVER = 4
53
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
70 RECURSE_LIST = 5
23
039dd321a7d0 [svn] If an archive contains other archives, and the user didn't specify that
brett
parents: 22
diff changeset
71
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
72 mimetypes.encodings_map.setdefault('.bz2', 'bzip2')
34
a8f875e02c83 [svn] Add support for LZMA compression. Holy crap that was easy.
brett
parents: 33
diff changeset
73 mimetypes.encodings_map.setdefault('.lzma', 'lzma')
41
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
74 mimetypes.types_map.setdefault('.gem', 'application/x-ruby-gem')
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
75
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
76 logger = logging.getLogger('dtrx-log')
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
77
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
78 class FilenameChecker(object):
58
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
diff changeset
79 free_func = os.open
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
diff changeset
80 free_args = (os.O_CREAT | os.O_EXCL,)
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
diff changeset
81 free_close = os.close
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
diff changeset
82
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
83 def __init__(self, original_name):
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
84 self.original_name = original_name
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
85
17
481a2b4be471 [svn] Lots of tests for various boundary cases, and slightly better handling for
brett
parents: 16
diff changeset
86 def is_free(self, filename):
58
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
diff changeset
87 try:
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
diff changeset
88 result = self.free_func(filename, *self.free_args)
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
diff changeset
89 except OSError, error:
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
diff changeset
90 if error.errno == errno.EEXIST:
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
diff changeset
91 return False
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
diff changeset
92 raise
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
diff changeset
93 if self.free_close:
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
diff changeset
94 self.free_close(result)
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
diff changeset
95 return True
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
96
41
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
97 def create(self):
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
98 fd, filename = tempfile.mkstemp(prefix=self.original_name + '.',
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
99 dir='.')
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
100 os.close(fd)
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
101 return filename
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
102
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
103 def check(self):
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
104 for suffix in [''] + ['.%s' % (x,) for x in range(1, 10)]:
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
105 filename = '%s%s' % (self.original_name, suffix)
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
106 if self.is_free(filename):
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
107 return filename
41
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
108 return self.create()
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
109
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
110
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
111 class DirectoryChecker(FilenameChecker):
58
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
diff changeset
112 free_func = os.mkdir
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
diff changeset
113 free_args = ()
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
diff changeset
114 free_close = None
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
115
41
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
116 def create(self):
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
117 return tempfile.mkdtemp(prefix=self.original_name + '.', dir='.')
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
118
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
119
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
120 class ExtractorError(Exception):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
121 pass
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
122
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
123
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
124 class ExtractorUnusable(Exception):
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
125 pass
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
126
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
127
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
128 EXTRACTION_ERRORS = (ExtractorError, ExtractorUnusable, OSError, IOError)
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
129
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
130 class BaseExtractor(object):
34
a8f875e02c83 [svn] Add support for LZMA compression. Holy crap that was easy.
brett
parents: 33
diff changeset
131 decoders = {'bzip2': 'bzcat', 'gzip': 'zcat', 'compress': 'zcat',
a8f875e02c83 [svn] Add support for LZMA compression. Holy crap that was easy.
brett
parents: 33
diff changeset
132 'lzma': 'lzcat'}
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
133 name_checker = DirectoryChecker
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
134
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
135 def __init__(self, filename, encoding):
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
136 if encoding and (not self.decoders.has_key(encoding)):
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
137 raise ValueError("unrecognized encoding %s" % (encoding,))
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
138 self.filename = os.path.realpath(filename)
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
139 self.encoding = encoding
69
35a2f45cdd3b Count files in the archive and report that in the recursion prompt.
Brett Smith <brett@brettcsmith.org>
parents: 66
diff changeset
140 self.file_count = 0
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
141 self.included_archives = []
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
142 self.target = None
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
143 self.content_type = None
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
144 self.content_name = None
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
145 self.pipes = []
62
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
diff changeset
146 self.stderr = tempfile.TemporaryFile()
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
diff changeset
147 self.exit_codes = []
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
148 try:
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
149 self.archive = open(filename, 'r')
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
150 except (IOError, OSError), error:
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
151 raise ExtractorError("could not open %s: %s" %
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
152 (filename, error.strerror))
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
153 if encoding:
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
154 self.pipe([self.decoders[encoding]], "decoding")
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
155 self.prepare()
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
156
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
157 def pipe(self, command, description="extraction"):
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
158 self.pipes.append((command, description))
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
159
62
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
diff changeset
160 def first_bad_exit_code(self):
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
diff changeset
161 for index, code in enumerate(self.exit_codes):
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
diff changeset
162 if code != 0:
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
diff changeset
163 return index
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
diff changeset
164 return None
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
diff changeset
165
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
166 def run_pipes(self, final_stdout=None):
41
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
167 if not self.pipes:
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
168 return
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
169 elif final_stdout is None:
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
170 # FIXME: Buffering this might be dumb.
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
171 final_stdout = tempfile.TemporaryFile()
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
172 num_pipes = len(self.pipes)
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
173 last_pipe = num_pipes - 1
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
174 processes = []
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
175 for index, command in enumerate([pipe[0] for pipe in self.pipes]):
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
176 if index == 0:
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
177 stdin = self.archive
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
178 else:
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
179 stdin = processes[-1].stdout
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
180 if index == last_pipe:
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
181 stdout = final_stdout
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
182 else:
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
183 stdout = subprocess.PIPE
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
184 try:
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
185 processes.append(subprocess.Popen(command, stdin=stdin,
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
186 stdout=stdout,
62
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
diff changeset
187 stderr=self.stderr))
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
188 except OSError, error:
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
189 if error.errno == errno.ENOENT:
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
190 raise ExtractorUnusable("could not run %s" % (command[0],))
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
191 raise
62
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
diff changeset
192 self.exit_codes = [pipe.wait() for pipe in processes]
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
193 self.archive.close()
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
194 for index in range(last_pipe):
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
195 processes[index].stdout.close()
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
196 self.archive = final_stdout
62
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
diff changeset
197
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
198 def prepare(self):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
199 pass
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
200
47
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
201 def check_included_archives(self):
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
202 if (self.content_name is None) or (not self.content_name.endswith('/')):
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
203 self.included_root = './'
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
204 else:
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
205 self.included_root = self.content_name
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
206 start_index = len(self.included_root)
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
207 for path, dirname, filenames in os.walk(self.included_root):
69
35a2f45cdd3b Count files in the archive and report that in the recursion prompt.
Brett Smith <brett@brettcsmith.org>
parents: 66
diff changeset
208 self.file_count += len(filenames)
47
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
209 path = path[start_index:]
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
210 for filename in filenames:
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
211 if (ExtractorBuilder.try_by_mimetype(filename) or
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
212 ExtractorBuilder.try_by_extension(filename)):
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
213 self.included_archives.append(os.path.join(path, filename))
22
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
214
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
215 def check_contents(self):
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
216 self.contents = os.listdir('.')
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
217 if not self.contents:
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
218 self.content_type = EMPTY
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
219 elif len(self.contents) == 1:
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
220 if self.basename() == self.contents[0]:
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
221 self.content_type = MATCHING_DIRECTORY
65
0aea49161478 Make the wording on the One Entry question a little clearer.
Brett Smith <brett@brettcsmith.org>
parents: 62
diff changeset
222 elif os.path.isdir(self.contents[0]):
0aea49161478 Make the wording on the One Entry question a little clearer.
Brett Smith <brett@brettcsmith.org>
parents: 62
diff changeset
223 self.content_type = ONE_ENTRY_DIRECTORY
22
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
224 else:
65
0aea49161478 Make the wording on the One Entry question a little clearer.
Brett Smith <brett@brettcsmith.org>
parents: 62
diff changeset
225 self.content_type = ONE_ENTRY_FILE
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
226 self.content_name = self.contents[0]
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
227 if os.path.isdir(self.contents[0]):
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
228 self.content_name += '/'
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
229 else:
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
230 self.content_type = BOMB
47
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
231 self.check_included_archives()
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
232
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
233 def basename(self):
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
234 pieces = os.path.basename(self.filename).split('.')
2
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
235 extension = '.' + pieces[-1]
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
236 if mimetypes.encodings_map.has_key(extension):
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
237 pieces.pop()
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
238 extension = '.' + pieces[-1]
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
239 if (mimetypes.types_map.has_key(extension) or
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
240 mimetypes.common_types.has_key(extension) or
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
241 mimetypes.suffix_map.has_key(extension)):
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
242 pieces.pop()
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
243 return '.'.join(pieces)
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
244
78
978307ec7d11 Don't show errors from failed extractors unless they all fail.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 77
diff changeset
245 def get_stderr(self):
62
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
diff changeset
246 self.stderr.seek(0, 0)
78
978307ec7d11 Don't show errors from failed extractors unless they all fail.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 77
diff changeset
247 errors = self.stderr.read(-1)
62
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
diff changeset
248 self.stderr.close()
78
978307ec7d11 Don't show errors from failed extractors unless they all fail.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 77
diff changeset
249 return errors
978307ec7d11 Don't show errors from failed extractors unless they all fail.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 77
diff changeset
250
978307ec7d11 Don't show errors from failed extractors unless they all fail.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 77
diff changeset
251 def check_success(self, got_output):
62
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
diff changeset
252 error_index = self.first_bad_exit_code()
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
diff changeset
253 if (not got_output) and (error_index is not None):
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
diff changeset
254 command = ' '.join(self.pipes[error_index][0])
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
diff changeset
255 raise ExtractorError("%s error: '%s' returned status code %s" %
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
diff changeset
256 (self.pipes[error_index][1], command,
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
diff changeset
257 self.exit_codes[error_index]))
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
diff changeset
258
80
df9b3428e28f Move more common extraction/listing functionality into BaseExtractor.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 79
diff changeset
259 def extract_archive(self):
df9b3428e28f Move more common extraction/listing functionality into BaseExtractor.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 79
diff changeset
260 self.pipe(self.extract_pipe)
df9b3428e28f Move more common extraction/listing functionality into BaseExtractor.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 79
diff changeset
261 self.run_pipes()
df9b3428e28f Move more common extraction/listing functionality into BaseExtractor.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 79
diff changeset
262
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
263 def extract(self):
40
ee6a869f8da1 [svn] Be a little nicer about explaining that we can't extract to the current
brett
parents: 39
diff changeset
264 try:
ee6a869f8da1 [svn] Be a little nicer about explaining that we can't extract to the current
brett
parents: 39
diff changeset
265 self.target = tempfile.mkdtemp(prefix='.dtrx-', dir='.')
ee6a869f8da1 [svn] Be a little nicer about explaining that we can't extract to the current
brett
parents: 39
diff changeset
266 except (OSError, IOError), error:
ee6a869f8da1 [svn] Be a little nicer about explaining that we can't extract to the current
brett
parents: 39
diff changeset
267 raise ExtractorError("cannot extract here: %s" % (error.strerror,))
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
268 old_path = os.path.realpath(os.curdir)
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
269 os.chdir(self.target)
33
3547e3124729 [svn] Fix some bugs and make things a little more user-friendly now that we can
brett
parents: 32
diff changeset
270 try:
3547e3124729 [svn] Fix some bugs and make things a little more user-friendly now that we can
brett
parents: 32
diff changeset
271 self.archive.seek(0, 0)
3547e3124729 [svn] Fix some bugs and make things a little more user-friendly now that we can
brett
parents: 32
diff changeset
272 self.extract_archive()
3547e3124729 [svn] Fix some bugs and make things a little more user-friendly now that we can
brett
parents: 32
diff changeset
273 self.check_contents()
62
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
diff changeset
274 self.check_success(self.content_type != EMPTY)
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
275 except EXTRACTION_ERRORS:
48
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
276 self.archive.close()
33
3547e3124729 [svn] Fix some bugs and make things a little more user-friendly now that we can
brett
parents: 32
diff changeset
277 os.chdir(old_path)
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
278 shutil.rmtree(self.target, ignore_errors=True)
33
3547e3124729 [svn] Fix some bugs and make things a little more user-friendly now that we can
brett
parents: 32
diff changeset
279 raise
48
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
280 self.archive.close()
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
281 os.chdir(old_path)
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
282
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
283 def get_filenames(self):
80
df9b3428e28f Move more common extraction/listing functionality into BaseExtractor.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 79
diff changeset
284 self.pipe(self.list_pipe, "listing")
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
285 self.run_pipes()
79
9c0cc7aef510 Improve dtrx -l performance on misnamed files, and clean other error messages.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 78
diff changeset
286 self.check_success(False)
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
287 self.archive.seek(0, 0)
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
288 while True:
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
289 line = self.archive.readline()
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
290 if not line:
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
291 self.archive.close()
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
292 return
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
293 yield line.rstrip('\n')
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
294
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
295
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
296 class CompressionExtractor(BaseExtractor):
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
297 file_type = 'compressed file'
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
298 name_checker = FilenameChecker
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
299
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
300 def basename(self):
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
301 pieces = os.path.basename(self.filename).split('.')
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
302 extension = '.' + pieces[-1]
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
303 if mimetypes.encodings_map.has_key(extension):
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
304 pieces.pop()
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
305 return '.'.join(pieces)
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
306
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
307 def get_filenames(self):
79
9c0cc7aef510 Improve dtrx -l performance on misnamed files, and clean other error messages.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 78
diff changeset
308 # This code used to just immediately yield the basename, under the
9c0cc7aef510 Improve dtrx -l performance on misnamed files, and clean other error messages.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 78
diff changeset
309 # assumption that that would be the filename. However, if that
9c0cc7aef510 Improve dtrx -l performance on misnamed files, and clean other error messages.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 78
diff changeset
310 # happens, dtrx -l will report this as a valid result for files with
9c0cc7aef510 Improve dtrx -l performance on misnamed files, and clean other error messages.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 78
diff changeset
311 # compression extensions, even if those files shouldn't actually be
9c0cc7aef510 Improve dtrx -l performance on misnamed files, and clean other error messages.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 78
diff changeset
312 # handled this way. So, we call out to the file command to do a quick
9c0cc7aef510 Improve dtrx -l performance on misnamed files, and clean other error messages.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 78
diff changeset
313 # check and make sure this actually looks like a compressed file.
9c0cc7aef510 Improve dtrx -l performance on misnamed files, and clean other error messages.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 78
diff changeset
314 if 'compress' not in [match[0] for match in
9c0cc7aef510 Improve dtrx -l performance on misnamed files, and clean other error messages.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 78
diff changeset
315 ExtractorBuilder.try_by_magic(self.filename)]:
9c0cc7aef510 Improve dtrx -l performance on misnamed files, and clean other error messages.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 78
diff changeset
316 raise ExtractorError("doesn't look like a compressed file")
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
317 yield self.basename()
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
318
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
319 def extract(self):
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
320 self.content_type = ONE_ENTRY_KNOWN
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
321 self.content_name = self.basename()
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
322 self.contents = None
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
323 self.included_root = './'
40
ee6a869f8da1 [svn] Be a little nicer about explaining that we can't extract to the current
brett
parents: 39
diff changeset
324 try:
ee6a869f8da1 [svn] Be a little nicer about explaining that we can't extract to the current
brett
parents: 39
diff changeset
325 output_fd, self.target = tempfile.mkstemp(prefix='.dtrx-', dir='.')
ee6a869f8da1 [svn] Be a little nicer about explaining that we can't extract to the current
brett
parents: 39
diff changeset
326 except (OSError, IOError), error:
ee6a869f8da1 [svn] Be a little nicer about explaining that we can't extract to the current
brett
parents: 39
diff changeset
327 raise ExtractorError("cannot extract here: %s" % (error.strerror,))
62
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
diff changeset
328 self.run_pipes(output_fd)
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
diff changeset
329 os.close(output_fd)
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
330 try:
62
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
diff changeset
331 self.check_success(os.stat(self.target)[stat.ST_SIZE] > 0)
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
332 except EXTRACTION_ERRORS:
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
333 os.unlink(self.target)
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
334 raise
62
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
diff changeset
335
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
336 class TarExtractor(BaseExtractor):
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
337 file_type = 'tar file'
80
df9b3428e28f Move more common extraction/listing functionality into BaseExtractor.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 79
diff changeset
338 extract_pipe = ['tar', '-x']
df9b3428e28f Move more common extraction/listing functionality into BaseExtractor.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 79
diff changeset
339 list_pipe = ['tar', '-t']
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
340
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
341
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
342 class CpioExtractor(BaseExtractor):
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
343 file_type = 'cpio file'
80
df9b3428e28f Move more common extraction/listing functionality into BaseExtractor.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 79
diff changeset
344 extract_pipe = ['cpio', '-i', '--make-directories', '--quiet',
df9b3428e28f Move more common extraction/listing functionality into BaseExtractor.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 79
diff changeset
345 '--no-absolute-filenames']
83
cb56c72f3d42 Use the --quiet option for cpio -t too.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 82
diff changeset
346 list_pipe = ['cpio', '-t', '--quiet']
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
347
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
348
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
349 class RPMExtractor(CpioExtractor):
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
350 file_type = 'RPM'
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
351
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
352 def prepare(self):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
353 self.pipe(['rpm2cpio', '-'], "rpm2cpio")
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
354
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
355 def basename(self):
9
920417b8acc9 [svn] Fix issues with basename methods. First, string's rsplit method only
brett
parents: 8
diff changeset
356 pieces = os.path.basename(self.filename).split('.')
2
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
357 if len(pieces) == 1:
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
358 return pieces[0]
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
359 elif pieces[-1] != 'rpm':
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
360 return BaseExtractor.basename(self)
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
361 pieces.pop()
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
362 if len(pieces) == 1:
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
363 return pieces[0]
9
920417b8acc9 [svn] Fix issues with basename methods. First, string's rsplit method only
brett
parents: 8
diff changeset
364 elif len(pieces[-1]) < 8:
2
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
365 pieces.pop()
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
366 return '.'.join(pieces)
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
367
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
368 def check_contents(self):
47
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
369 self.check_included_archives()
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
370 self.content_type = BOMB
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
371
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
372
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
373 class DebExtractor(TarExtractor):
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
374 file_type = 'Debian package'
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
375
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
376 def prepare(self):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
377 self.pipe(['ar', 'p', self.filename, 'data.tar.gz'],
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
378 "data.tar.gz extraction")
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
379 self.pipe(['zcat'], "data.tar.gz decompression")
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
380
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
381 def basename(self):
9
920417b8acc9 [svn] Fix issues with basename methods. First, string's rsplit method only
brett
parents: 8
diff changeset
382 pieces = os.path.basename(self.filename).split('_')
2
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
383 if len(pieces) == 1:
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
384 return pieces[0]
9
920417b8acc9 [svn] Fix issues with basename methods. First, string's rsplit method only
brett
parents: 8
diff changeset
385 last_piece = pieces.pop()
920417b8acc9 [svn] Fix issues with basename methods. First, string's rsplit method only
brett
parents: 8
diff changeset
386 if (len(last_piece) > 10) or (not last_piece.endswith('.deb')):
2
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
387 return BaseExtractor.basename(self)
9
920417b8acc9 [svn] Fix issues with basename methods. First, string's rsplit method only
brett
parents: 8
diff changeset
388 return '_'.join(pieces)
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
389
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
390 def check_contents(self):
47
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
391 self.check_included_archives()
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
392 self.content_type = BOMB
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
393
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
394
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
395 class DebMetadataExtractor(DebExtractor):
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
396 def prepare(self):
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
397 self.pipe(['ar', 'p', self.filename, 'control.tar.gz'],
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
398 "control.tar.gz extraction")
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
399 self.pipe(['zcat'], "control.tar.gz decompression")
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
400
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
401
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
402 class GemExtractor(TarExtractor):
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
403 file_type = 'Ruby gem'
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
404
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
405 def prepare(self):
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
406 self.pipe(['tar', '-xO', 'data.tar.gz'], "data.tar.gz extraction")
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
407 self.pipe(['zcat'], "data.tar.gz decompression")
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
408
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
409 def check_contents(self):
47
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
410 self.check_included_archives()
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
411 self.content_type = BOMB
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
412
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
413
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
414 class GemMetadataExtractor(CompressionExtractor):
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
415 file_type = 'Ruby gem'
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
416
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
417 def prepare(self):
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
418 self.pipe(['tar', '-xO', 'metadata.gz'], "metadata.gz extraction")
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
419 self.pipe(['zcat'], "metadata.gz decompression")
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
420
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
421 def basename(self):
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
422 return os.path.basename(self.filename) + '-metadata.txt'
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
423
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
424
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
425 class NoPipeExtractor(BaseExtractor):
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
426 # Some extraction tools won't accept the archive from stdin. With
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
427 # these, the piping infrastructure we normally set up generally doesn't
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
428 # work, at least at first. We can still use most of it; we just don't
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
429 # want to seed self.archive with the archive file, since that sucks up
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
430 # memory. So instead we seed it with /dev/null, and specify the
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
431 # filename on the command line as necessary. We also open the actual
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
432 # file with os.open, to make sure we can actually do it (permissions
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
433 # are good, etc.). This class doesn't do anything by itself; it's just
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
434 # meant to be a base class for extractors that rely on these dumb
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
435 # tools.
32
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
436 def __init__(self, filename, encoding):
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
437 os.close(os.open(filename, os.O_RDONLY))
32
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
438 BaseExtractor.__init__(self, '/dev/null', None)
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
439 self.filename = os.path.realpath(filename)
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
440
80
df9b3428e28f Move more common extraction/listing functionality into BaseExtractor.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 79
diff changeset
441 def extract_archive(self):
df9b3428e28f Move more common extraction/listing functionality into BaseExtractor.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 79
diff changeset
442 self.extract_pipe = self.extract_command + [self.filename]
df9b3428e28f Move more common extraction/listing functionality into BaseExtractor.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 79
diff changeset
443 BaseExtractor.extract_archive(self)
df9b3428e28f Move more common extraction/listing functionality into BaseExtractor.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 79
diff changeset
444
df9b3428e28f Move more common extraction/listing functionality into BaseExtractor.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 79
diff changeset
445 def get_filenames(self):
df9b3428e28f Move more common extraction/listing functionality into BaseExtractor.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 79
diff changeset
446 self.list_pipe = self.list_command + [self.filename]
df9b3428e28f Move more common extraction/listing functionality into BaseExtractor.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 79
diff changeset
447 return BaseExtractor.get_filenames(self)
df9b3428e28f Move more common extraction/listing functionality into BaseExtractor.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 79
diff changeset
448
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
449
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
450 class ZipExtractor(NoPipeExtractor):
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
451 file_type = 'Zip file'
80
df9b3428e28f Move more common extraction/listing functionality into BaseExtractor.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 79
diff changeset
452 extract_command = ['unzip', '-q']
df9b3428e28f Move more common extraction/listing functionality into BaseExtractor.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 79
diff changeset
453 list_command = ['zipinfo', '-1']
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
454
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
455
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
456 class SevenExtractor(NoPipeExtractor):
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
457 file_type = '7z file'
80
df9b3428e28f Move more common extraction/listing functionality into BaseExtractor.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 79
diff changeset
458 extract_command = ['7z', 'x']
df9b3428e28f Move more common extraction/listing functionality into BaseExtractor.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 79
diff changeset
459 list_command = ['7z', 'l']
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
460 border_re = re.compile('^[- ]+$')
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
461
32
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
462 def get_filenames(self):
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
463 fn_index = None
79
9c0cc7aef510 Improve dtrx -l performance on misnamed files, and clean other error messages.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 78
diff changeset
464 for line in NoPipeExtractor.get_filenames(self):
32
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
465 if self.border_re.match(line):
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
466 if fn_index is not None:
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
467 break
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
468 else:
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
469 fn_index = line.rindex(' ') + 1
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
470 elif fn_index is not None:
82
6db35db38795 Stop worrying about trailing newlines in get_filenames() overrides.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 81
diff changeset
471 yield line[fn_index:]
32
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
472 self.archive.close()
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
473
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
474
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
475 class CABExtractor(NoPipeExtractor):
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
476 file_type = 'CAB archive'
80
df9b3428e28f Move more common extraction/listing functionality into BaseExtractor.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 79
diff changeset
477 extract_command = ['cabextract', '-q']
df9b3428e28f Move more common extraction/listing functionality into BaseExtractor.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 79
diff changeset
478 list_command = ['cabextract', '-l']
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
479 border_re = re.compile(r'^[-\+]+$')
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
480
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
481 def get_filenames(self):
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
482 fn_index = None
79
9c0cc7aef510 Improve dtrx -l performance on misnamed files, and clean other error messages.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 78
diff changeset
483 filenames = NoPipeExtractor.get_filenames(self)
9c0cc7aef510 Improve dtrx -l performance on misnamed files, and clean other error messages.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 78
diff changeset
484 for line in filenames:
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
485 if self.border_re.match(line):
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
486 break
79
9c0cc7aef510 Improve dtrx -l performance on misnamed files, and clean other error messages.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 78
diff changeset
487 for line in filenames:
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
488 try:
82
6db35db38795 Stop worrying about trailing newlines in get_filenames() overrides.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 81
diff changeset
489 yield line.split(' | ', 2)[2]
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
490 except IndexError:
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
491 break
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
492 self.archive.close()
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
493
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
494
72
c4cfaf634bb9 Add support for InstallShield archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 71
diff changeset
495 class ShieldExtractor(NoPipeExtractor):
c4cfaf634bb9 Add support for InstallShield archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 71
diff changeset
496 file_type = 'InstallShield archive'
80
df9b3428e28f Move more common extraction/listing functionality into BaseExtractor.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 79
diff changeset
497 extract_command = ['unshield', 'x']
df9b3428e28f Move more common extraction/listing functionality into BaseExtractor.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 79
diff changeset
498 list_command = ['unshield', 'l']
72
c4cfaf634bb9 Add support for InstallShield archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 71
diff changeset
499 prefix_re = re.compile(r'^\s+\d+\s+')
c4cfaf634bb9 Add support for InstallShield archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 71
diff changeset
500 end_re = re.compile(r'^\s+-+\s+-+\s*$')
c4cfaf634bb9 Add support for InstallShield archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 71
diff changeset
501
c4cfaf634bb9 Add support for InstallShield archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 71
diff changeset
502 def get_filenames(self):
79
9c0cc7aef510 Improve dtrx -l performance on misnamed files, and clean other error messages.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 78
diff changeset
503 for line in NoPipeExtractor.get_filenames(self):
72
c4cfaf634bb9 Add support for InstallShield archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 71
diff changeset
504 if self.end_re.match(line):
c4cfaf634bb9 Add support for InstallShield archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 71
diff changeset
505 break
c4cfaf634bb9 Add support for InstallShield archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 71
diff changeset
506 else:
c4cfaf634bb9 Add support for InstallShield archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 71
diff changeset
507 match = self.prefix_re.match(line)
c4cfaf634bb9 Add support for InstallShield archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 71
diff changeset
508 if match:
82
6db35db38795 Stop worrying about trailing newlines in get_filenames() overrides.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 81
diff changeset
509 yield line[match.end():]
72
c4cfaf634bb9 Add support for InstallShield archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 71
diff changeset
510 self.archive.close()
c4cfaf634bb9 Add support for InstallShield archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 71
diff changeset
511
c4cfaf634bb9 Add support for InstallShield archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 71
diff changeset
512 def basename(self):
c4cfaf634bb9 Add support for InstallShield archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 71
diff changeset
513 result = NoPipeExtractor.basename(self)
c4cfaf634bb9 Add support for InstallShield archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 71
diff changeset
514 if result.endswith('.hdr'):
c4cfaf634bb9 Add support for InstallShield archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 71
diff changeset
515 result = result[:-4]
c4cfaf634bb9 Add support for InstallShield archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 71
diff changeset
516 return result
c4cfaf634bb9 Add support for InstallShield archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 71
diff changeset
517
c4cfaf634bb9 Add support for InstallShield archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 71
diff changeset
518
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
519 class BaseHandler(object):
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
520 def __init__(self, extractor, options):
8
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
521 self.extractor = extractor
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
522 self.options = options
17
481a2b4be471 [svn] Lots of tests for various boundary cases, and slightly better handling for
brett
parents: 16
diff changeset
523 self.target = None
16
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
524
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
525 def handle(self):
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
526 command = 'find'
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
527 status = subprocess.call(['find', self.extractor.target, '-type', 'd',
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
528 '-exec', 'chmod', 'u+rwx', '{}', ';'])
8
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
529 if status == 0:
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
530 command = 'chmod'
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
531 status = subprocess.call(['chmod', '-R', 'u+rwX',
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
532 self.extractor.target])
8
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
533 if status != 0:
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
534 return "%s returned with exit status %s" % (command, status)
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
535 return self.organize()
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
536
41
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
537 def set_target(self, target, checker):
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
538 self.target = checker(target).check()
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
539 if self.target != target:
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
540 logger.warning("extracting %s to %s" %
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
541 (self.extractor.filename, self.target))
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
542
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
543
16
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
544 # The "where to extract" table, with options and archive types.
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
545 # This dictates the contents of each can_handle method.
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
546 #
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
547 # Flat Overwrite None
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
548 # File basename basename FilenameChecked
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
549 # Match . . tempdir + checked
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
550 # Bomb . basename DirectoryChecked
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
551
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
552 class FlatHandler(BaseHandler):
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
553 def can_handle(contents, options):
22
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
554 return ((options.flat and (contents != ONE_ENTRY_KNOWN)) or
16
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
555 (options.overwrite and (contents == MATCHING_DIRECTORY)))
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
556 can_handle = staticmethod(can_handle)
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
557
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
558 def organize(self):
17
481a2b4be471 [svn] Lots of tests for various boundary cases, and slightly better handling for
brett
parents: 16
diff changeset
559 self.target = '.'
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
560 for curdir, dirs, filenames in os.walk(self.extractor.target,
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
561 topdown=False):
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
562 path_parts = curdir.split(os.sep)
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
563 if path_parts[0] == '.':
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
564 del path_parts[1]
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
565 else:
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
566 del path_parts[0]
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
567 newdir = os.path.join(*path_parts)
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
568 if not os.path.isdir(newdir):
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
569 os.makedirs(newdir)
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
570 for filename in filenames:
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
571 os.rename(os.path.join(curdir, filename),
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
572 os.path.join(newdir, filename))
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
573 os.rmdir(curdir)
16
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
574
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
575
16
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
576 class OverwriteHandler(BaseHandler):
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
577 def can_handle(contents, options):
22
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
578 return ((options.flat and (contents == ONE_ENTRY_KNOWN)) or
16
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
579 (options.overwrite and (contents != MATCHING_DIRECTORY)))
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
580 can_handle = staticmethod(can_handle)
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
581
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
582 def organize(self):
16
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
583 self.target = self.extractor.basename()
51
f1789e6586d8 [svn] Don't try to rmtree when overwriting just a file.
brett
parents: 49
diff changeset
584 if os.path.isdir(self.target):
f1789e6586d8 [svn] Don't try to rmtree when overwriting just a file.
brett
parents: 49
diff changeset
585 shutil.rmtree(self.target)
48
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
586 os.rename(self.extractor.target, self.target)
16
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
587
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
588
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
589 class MatchHandler(BaseHandler):
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
590 def can_handle(contents, options):
20
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
591 return ((contents == MATCHING_DIRECTORY) or
65
0aea49161478 Make the wording on the One Entry question a little clearer.
Brett Smith <brett@brettcsmith.org>
parents: 62
diff changeset
592 ((contents in ONE_ENTRY_UNKNOWN) and
25
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
593 options.one_entry_policy.ok_for_match()))
16
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
594 can_handle = staticmethod(can_handle)
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
595
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
596 def organize(self):
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
597 source = os.path.join(self.extractor.target,
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
598 os.listdir(self.extractor.target)[0])
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
599 if os.path.isdir(source):
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
600 checker = DirectoryChecker
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
601 else:
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
602 checker = FilenameChecker
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
603 if self.options.one_entry_policy == EXTRACT_HERE:
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
604 destination = self.extractor.content_name.rstrip('/')
20
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
605 else:
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
606 destination = self.extractor.basename()
41
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
607 self.set_target(destination, checker)
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
608 if os.path.isdir(self.extractor.target):
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
609 os.rename(source, self.target)
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
610 os.rmdir(self.extractor.target)
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
611 else:
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
612 os.rename(self.extractor.target, self.target)
53
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
613 self.extractor.included_root = './'
8
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
614
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
615
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
616 class EmptyHandler(object):
16
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
617 def can_handle(contents, options):
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
618 return contents == EMPTY
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
619 can_handle = staticmethod(can_handle)
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
620
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
621 def __init__(self, extractor, options): pass
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
622 def handle(self): pass
8
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
623
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
624
16
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
625 class BombHandler(BaseHandler):
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
626 def can_handle(contents, options):
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
627 return True
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
628 can_handle = staticmethod(can_handle)
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
629
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
630 def organize(self):
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
631 basename = self.extractor.basename()
41
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
diff changeset
632 self.set_target(basename, self.extractor.name_checker)
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
633 os.rename(self.extractor.target, self.target)
16
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
634
29794d4d41aa [svn] There's now an entirely new object hierarchy for handlers, because the
brett
parents: 15
diff changeset
635
25
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
636 class BasePolicy(object):
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
637 def __init__(self, options):
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
638 self.current_policy = None
26
d660410455d9 [svn] Little DRY cleanups.
brett
parents: 25
diff changeset
639 if options.batch:
d660410455d9 [svn] Little DRY cleanups.
brett
parents: 25
diff changeset
640 self.permanent_policy = self.answers['']
d660410455d9 [svn] Little DRY cleanups.
brett
parents: 25
diff changeset
641 else:
d660410455d9 [svn] Little DRY cleanups.
brett
parents: 25
diff changeset
642 self.permanent_policy = None
25
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
643
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
644 def ask_question(self, question):
65
0aea49161478 Make the wording on the One Entry question a little clearer.
Brett Smith <brett@brettcsmith.org>
parents: 62
diff changeset
645 question = question + self.choices
25
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
646 while True:
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
647 print "\n".join(question)
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
648 try:
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
649 answer = raw_input(self.prompt)
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
650 except EOFError:
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
651 return self.answers['']
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
652 try:
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
653 return self.answers[answer.lower()]
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
654 except KeyError:
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
655 print
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
656
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
657 def __cmp__(self, other):
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
658 return cmp(self.current_policy, other)
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
659
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
660
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
661 class OneEntryPolicy(BasePolicy):
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
662 answers = {'h': EXTRACT_HERE, 'i': EXTRACT_WRAP, 'r': EXTRACT_RENAME,
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
663 '': EXTRACT_WRAP}
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
664 choices = ["You can:",
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
665 " * extract it Inside another directory",
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
666 " * extract it and Rename the directory",
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
667 " * extract it Here"]
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
668 prompt = "What do you want to do? (I/r/h) "
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
669
66
af0b822b012e Don't prompt for one entry handling with -f.
Brett Smith <brett@brettcsmith.org>
parents: 65
diff changeset
670 def __init__(self, options):
af0b822b012e Don't prompt for one entry handling with -f.
Brett Smith <brett@brettcsmith.org>
parents: 65
diff changeset
671 BasePolicy.__init__(self, options)
af0b822b012e Don't prompt for one entry handling with -f.
Brett Smith <brett@brettcsmith.org>
parents: 65
diff changeset
672 if options.flat:
84
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
673 default = 'h'
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
674 elif options.one_entry_default is not None:
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
675 default = options.one_entry_default.lower()
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
676 else:
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
677 return
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
678 if 'here'.startswith(default):
66
af0b822b012e Don't prompt for one entry handling with -f.
Brett Smith <brett@brettcsmith.org>
parents: 65
diff changeset
679 self.permanent_policy = EXTRACT_HERE
84
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
680 elif 'rename'.startswith(default):
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
681 self.permanent_policy = EXTRACT_RENAME
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
682 elif 'inside'.startswith(default):
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
683 self.permanent_policy = EXTRACT_WRAP
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
684 elif default is not None:
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
685 raise ValueError("bad value %s for default policy" % (default,))
66
af0b822b012e Don't prompt for one entry handling with -f.
Brett Smith <brett@brettcsmith.org>
parents: 65
diff changeset
686
65
0aea49161478 Make the wording on the One Entry question a little clearer.
Brett Smith <brett@brettcsmith.org>
parents: 62
diff changeset
687 def prep(self, archive_filename, extractor):
0aea49161478 Make the wording on the One Entry question a little clearer.
Brett Smith <brett@brettcsmith.org>
parents: 62
diff changeset
688 question = ["%s contains one %s, but it has a weird name." %
0aea49161478 Make the wording on the One Entry question a little clearer.
Brett Smith <brett@brettcsmith.org>
parents: 62
diff changeset
689 (archive_filename, extractor.content_type)]
0aea49161478 Make the wording on the One Entry question a little clearer.
Brett Smith <brett@brettcsmith.org>
parents: 62
diff changeset
690 question.append(" Expected: " + extractor.basename())
0aea49161478 Make the wording on the One Entry question a little clearer.
Brett Smith <brett@brettcsmith.org>
parents: 62
diff changeset
691 question.append(" Actual: " + extractor.content_name)
26
d660410455d9 [svn] Little DRY cleanups.
brett
parents: 25
diff changeset
692 self.current_policy = (self.permanent_policy or
d660410455d9 [svn] Little DRY cleanups.
brett
parents: 25
diff changeset
693 self.ask_question(question))
25
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
694
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
695 def ok_for_match(self):
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
696 return self.current_policy in (EXTRACT_RENAME, EXTRACT_HERE)
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
697
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
698
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
699 class RecursionPolicy(BasePolicy):
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
700 answers = {'o': RECURSE_ONCE, 'a': RECURSE_ALWAYS, 'n': RECURSE_NOT_NOW,
53
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
701 'v': RECURSE_NEVER, 'l': RECURSE_LIST, '': RECURSE_NOT_NOW}
25
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
702 choices = ["You can:",
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
703 " * Always extract included archives",
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
704 " * extract included archives this Once",
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
705 " * choose Not to extract included archives",
53
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
706 " * neVer extract included archives",
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
707 " * List included archives"]
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
708 prompt = "What do you want to do? (a/o/N/v/l) "
25
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
709
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
710 def __init__(self, options):
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
711 BasePolicy.__init__(self, options)
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
712 if options.show_list:
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
713 self.permanent_policy = RECURSE_NEVER
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
714 elif options.recursive:
25
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
715 self.permanent_policy = RECURSE_ALWAYS
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
716
53
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
717 def prep(self, current_filename, target, extractor):
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
718 archive_count = len(extractor.included_archives)
25
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
719 if (self.permanent_policy is not None) or (archive_count == 0):
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
720 self.current_policy = self.permanent_policy or RECURSE_NOT_NOW
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
721 return
69
35a2f45cdd3b Count files in the archive and report that in the recursion prompt.
Brett Smith <brett@brettcsmith.org>
parents: 66
diff changeset
722 question = (("%s contains %s other archive file(s), " +
70
48d2421a3178 Tweak wording of recursion question, and TODO.
Brett Smith <brett@brettcsmith.org>
parents: 69
diff changeset
723 "out of %s file(s) total.") %
69
35a2f45cdd3b Count files in the archive and report that in the recursion prompt.
Brett Smith <brett@brettcsmith.org>
parents: 66
diff changeset
724 (current_filename, archive_count, extractor.file_count))
65
0aea49161478 Make the wording on the One Entry question a little clearer.
Brett Smith <brett@brettcsmith.org>
parents: 62
diff changeset
725 question = textwrap.wrap(question)
53
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
726 if target == '.':
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
727 target = ''
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
728 included_root = extractor.included_root
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
729 if included_root == './':
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
730 included_root = ''
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
731 while True:
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
732 self.current_policy = self.ask_question(question)
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
733 if self.current_policy != RECURSE_LIST:
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
734 break
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
735 print ("\n%s\n" %
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
736 '\n'.join([os.path.join(target, included_root, filename)
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
737 for filename in extractor.included_archives]))
25
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
738 if self.current_policy in (RECURSE_ALWAYS, RECURSE_NEVER):
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
739 self.permanent_policy = self.current_policy
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
740
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
741 def ok_to_recurse(self):
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
742 return self.current_policy in (RECURSE_ALWAYS, RECURSE_ONCE)
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
743
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
744
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
745 class ExtractorBuilder(object):
81
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
746 extractor_map = {'tar': {'extractor': TarExtractor,
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
747 'mimetypes': ('x-tar',),
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
748 'extensions': ('tar',),
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
749 'magic': ('POSIX tar archive',)},
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
750 'zip': {'extractor': ZipExtractor,
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
751 'mimetypes': ('zip',),
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
752 'extensions': ('zip',),
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
753 'magic': ('(Zip|ZIP self-extracting) archive',)},
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
754 'rpm': {'extractor': RPMExtractor,
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
755 'mimetypes': ('x-redhat-package-manager', 'x-rpm'),
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
756 'extensions': ('rpm',),
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
757 'magic': ('RPM',)},
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
758 'deb': {'extractor': DebExtractor,
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
759 'metadata': DebMetadataExtractor,
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
760 'mimetypes': ('x-debian-package',),
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
761 'extensions': ('deb',),
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
762 'magic': ('Debian binary package',)},
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
763 'cpio': {'extractor': CpioExtractor,
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
764 'mimetypes': ('x-cpio',),
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
765 'extensions': ('cpio',),
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
766 'magic': ('cpio archive',)},
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
767 'gem': {'extractor': GemExtractor,
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
768 'metadata': GemMetadataExtractor,
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
769 'mimetypes': ('x-ruby-gem',),
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
770 'extensions': ('gem',)},
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
771 '7z': {'extractor': SevenExtractor,
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
772 'mimetypes': ('x-7z-compressed',),
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
773 'extensions': ('7z',),
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
774 'magic': ('7-zip archive',)},
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
775 'cab': {'extractor': CABExtractor,
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
776 'mimetypes': ('x-cab',),
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
777 'extensions': ('cab',),
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
778 'magic': ('Microsoft Cabinet Archive',)},
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
779 'shield': {'extractor': ShieldExtractor,
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
780 'mimetypes': ('x-cab',),
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
781 'extensions': ('cab', 'hdr'),
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
782 'magic': ('InstallShield CAB',)},
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
783 'compress': {'extractor': CompressionExtractor}
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
784 }
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
785
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
786 mimetype_map = {}
81
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
787 magic_mime_map = {}
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
788 extension_map = {}
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
789 for ext_name, ext_info in extractor_map.items():
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
790 for mimetype in ext_info.get('mimetypes', ()):
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
791 if '/' not in mimetype:
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
792 mimetype = 'application/' + mimetype
81
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
793 mimetype_map[mimetype] = ext_name
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
794 for magic_re in ext_info.get('magic', ()):
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
795 magic_mime_map[re.compile(magic_re)] = ext_name
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
796 for extension in ext_info.get('extensions', ()):
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
797 extension_map.setdefault(extension, []).append((ext_name, None))
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
798
81
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
799 for mapping in (('tar', 'bzip2', 'tar.bz2'),
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
800 ('tar', 'gzip', 'tar.gz', 'tgz'),
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
801 ('compress', 'gzip', 'Z', 'gz'),
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
802 ('compress', 'bzip2', 'bz2'),
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
803 ('compress', 'lzma', 'lzma')):
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
804 for extension in mapping[2:]:
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
805 extension_map.setdefault(extension, []).append(mapping[:2])
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
806
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
807 magic_encoding_map = {}
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
808 for mapping in (('bzip2', 'bzip2 compressed'),
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
809 ('gzip', 'gzip compressed')):
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
810 for pattern in mapping[1:]:
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
811 magic_encoding_map[re.compile(pattern)] = mapping[0]
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
812
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
813 def __init__(self, filename, options):
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
814 self.filename = filename
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
815 self.options = options
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
816
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
817 def build_extractor(self, archive_type, encoding):
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
818 extractors = self.extractor_map[archive_type]
81
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
819 if self.options.metadata and extractors.has_key('metadata'):
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
820 extractor = extractors['metadata']
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
821 else:
81
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
822 extractor = extractors['extractor']
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
823 return extractor(self.filename, encoding)
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
824
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
825 def get_extractor(self):
48
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
826 tried_types = set()
36
4bf2508d9b9e [svn] Small optimization to be nice to the system: don't try a given extractor
brett
parents: 35
diff changeset
827 # As smart as it is, the magic test can't go first, because at least
4bf2508d9b9e [svn] Small optimization to be nice to the system: don't try a given extractor
brett
parents: 35
diff changeset
828 # on my system it just recognizes gem files as tar files. I guess
4bf2508d9b9e [svn] Small optimization to be nice to the system: don't try a given extractor
brett
parents: 35
diff changeset
829 # it's possible for the opposite problem to occur -- where the mimetype
4bf2508d9b9e [svn] Small optimization to be nice to the system: don't try a given extractor
brett
parents: 35
diff changeset
830 # or extension suggests something less than ideal -- but it seems less
4bf2508d9b9e [svn] Small optimization to be nice to the system: don't try a given extractor
brett
parents: 35
diff changeset
831 # likely so I'm sticking with this.
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
832 for func_name in ('mimetype', 'extension', 'magic'):
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
833 logger.debug("getting extractors by %s" % (func_name,))
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
834 extractor_types = \
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
835 getattr(self, 'try_by_' + func_name)(self.filename)
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
836 logger.debug("done getting extractors")
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
837 for ext_args in extractor_types:
36
4bf2508d9b9e [svn] Small optimization to be nice to the system: don't try a given extractor
brett
parents: 35
diff changeset
838 if ext_args in tried_types:
4bf2508d9b9e [svn] Small optimization to be nice to the system: don't try a given extractor
brett
parents: 35
diff changeset
839 continue
4bf2508d9b9e [svn] Small optimization to be nice to the system: don't try a given extractor
brett
parents: 35
diff changeset
840 tried_types.add(ext_args)
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
841 logger.debug("trying %s extractor from %s" %
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
842 (ext_args, func_name))
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
843 yield self.build_extractor(*ext_args)
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
844
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
845 def try_by_mimetype(cls, filename):
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
846 mimetype, encoding = mimetypes.guess_type(filename)
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
847 try:
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
848 return [(cls.mimetype_map[mimetype], encoding)]
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
849 except KeyError:
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
850 if encoding:
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
851 return [('compress', encoding)]
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
852 return []
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
853 try_by_mimetype = classmethod(try_by_mimetype)
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
854
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
855 def magic_map_matches(cls, output, magic_map):
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
856 return [result for regexp, result in magic_map.items()
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
857 if regexp.search(output)]
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
858 magic_map_matches = classmethod(magic_map_matches)
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
859
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
860 def try_by_magic(cls, filename):
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
861 process = subprocess.Popen(['file', '-z', filename],
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
862 stdout=subprocess.PIPE)
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
863 status = process.wait()
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
864 if status != 0:
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
865 return []
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
866 output = process.stdout.readline()
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
867 process.stdout.close()
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
868 if output.startswith('%s: ' % filename):
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
869 output = output[len(filename) + 2:]
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
870 mimes = cls.magic_map_matches(output, cls.magic_mime_map)
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
871 encodings = cls.magic_map_matches(output, cls.magic_encoding_map)
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
872 if mimes and not encodings:
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
873 encodings = [None]
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
874 elif encodings and not mimes:
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
875 mimes = ['compress']
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
876 return [(m, e) for m in mimes for e in encodings]
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
877 try_by_magic = classmethod(try_by_magic)
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
878
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
879 def try_by_extension(cls, filename):
43
4591a32eedc8 [svn] Sadly Python 2.3 does not have an rsplit method on strings.
brett
parents: 42
diff changeset
880 parts = filename.split('.')[-2:]
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
881 results = []
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
882 while parts:
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
883 results.extend(cls.extension_map.get('.'.join(parts), []))
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
884 del parts[0]
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
885 return results
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
886 try_by_extension = classmethod(try_by_extension)
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
887
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
888
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
889 class BaseAction(object):
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
890 def __init__(self, options, filenames):
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
891 self.options = options
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
892 self.filenames = filenames
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
893 self.target = None
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
894
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
895 def report(self, function, *args):
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
896 try:
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
897 error = function(*args)
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
898 except EXTRACTION_ERRORS, exception:
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
899 error = str(exception)
33
3547e3124729 [svn] Fix some bugs and make things a little more user-friendly now that we can
brett
parents: 32
diff changeset
900 logger.debug(''.join(traceback.format_exception(*sys.exc_info())))
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
901 return error
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
902
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
903
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
904 class ExtractionAction(BaseAction):
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
905 handlers = [FlatHandler, OverwriteHandler, MatchHandler, EmptyHandler,
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
906 BombHandler]
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
907
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
908 def __init__(self, options, filenames):
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
909 BaseAction.__init__(self, options, filenames)
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
910 self.did_print = False
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
911
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
912 def get_handler(self, extractor):
65
0aea49161478 Make the wording on the One Entry question a little clearer.
Brett Smith <brett@brettcsmith.org>
parents: 62
diff changeset
913 if extractor.content_type in ONE_ENTRY_UNKNOWN:
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
914 self.options.one_entry_policy.prep(self.current_filename,
65
0aea49161478 Make the wording on the One Entry question a little clearer.
Brett Smith <brett@brettcsmith.org>
parents: 62
diff changeset
915 extractor)
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
916 for handler in self.handlers:
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
917 if handler.can_handle(extractor.content_type, self.options):
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
918 logger.debug("using %s handler" % (handler.__name__,))
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
919 self.current_handler = handler(extractor, self.options)
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
920 break
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
921
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
922 def show_extraction(self, extractor):
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
923 if self.options.log_level > logging.INFO:
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
924 return
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
925 elif self.did_print:
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
926 print
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
927 else:
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
928 self.did_print = True
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
929 print "%s:" % (self.current_filename,)
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
930 if extractor.contents is None:
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
931 print self.current_handler.target
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
932 return
55
494516c027c4 [svn] Stupid Python 2.3 doesn't support [].sort(reverse=True).
brett
parents: 54
diff changeset
933 def reverser(x, y):
494516c027c4 [svn] Stupid Python 2.3 doesn't support [].sort(reverse=True).
brett
parents: 54
diff changeset
934 return cmp(y, x)
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
935 if self.current_handler.target == '.':
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
936 filenames = extractor.contents
55
494516c027c4 [svn] Stupid Python 2.3 doesn't support [].sort(reverse=True).
brett
parents: 54
diff changeset
937 filenames.sort(reverser)
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
938 else:
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
939 filenames = [self.current_handler.target]
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
940 pathjoin = os.path.join
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
941 isdir = os.path.isdir
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
942 while filenames:
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
943 filename = filenames.pop()
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
944 if isdir(filename):
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
945 print "%s/" % (filename,)
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
946 new_filenames = os.listdir(filename)
55
494516c027c4 [svn] Stupid Python 2.3 doesn't support [].sort(reverse=True).
brett
parents: 54
diff changeset
947 new_filenames.sort(reverser)
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
948 filenames.extend([pathjoin(filename, new_filename)
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
949 for new_filename in new_filenames])
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
950 else:
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
951 print filename
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
952
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
953 def run(self, filename, extractor):
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
954 self.current_filename = filename
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
955 error = (self.report(extractor.extract) or
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
956 self.report(self.get_handler, extractor) or
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
957 self.report(self.current_handler.handle) or
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
958 self.report(self.show_extraction, extractor))
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
959 if not error:
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
960 self.target = self.current_handler.target
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
961 return error
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
962
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
963
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
964 class ListAction(BaseAction):
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
965 def __init__(self, options, filenames):
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
966 BaseAction.__init__(self, options, filenames)
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
967 self.count = 0
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
968
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
969 def get_list(self, extractor):
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
970 # Note: The reason I'm getting all the filenames up front is
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
971 # because if we run into trouble partway through the archive, we'll
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
972 # try another extractor. So before we display anything we have to
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
973 # be sure this one is successful. We maybe don't have to be quite
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
974 # this conservative but this is the easy way out for now.
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
975 self.filelist = list(extractor.get_filenames())
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
976
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
977 def show_list(self, filename):
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
978 self.count += 1
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
979 if len(self.filenames) != 1:
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
980 if self.count > 1:
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
981 print
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
982 print "%s:" % (filename,)
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
983 print '\n'.join(self.filelist)
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
984
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
985 def run(self, filename, extractor):
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
986 return (self.report(self.get_list, extractor) or
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
987 self.report(self.show_list, filename))
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
988
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
989
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
990 class ExtractorApplication(object):
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
991 def __init__(self, arguments):
48
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
992 for signal_num in (signal.SIGINT, signal.SIGTERM):
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
993 signal.signal(signal_num, self.abort)
85
ad73f75c9046 Use the default action for SIGPIPE.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 84
diff changeset
994 signal.signal(signal.SIGPIPE, signal.SIG_DFL)
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
995 self.parse_options(arguments)
12
5d202467c589 [svn] Introduce a real logging system. Right now all this really gets us is the
brett
parents: 11
diff changeset
996 self.setup_logger()
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
997 self.successes = []
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
998 self.failures = []
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
999
77
3a1f49be7667 Clean the target directory if an extraction attempt failed.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 76
diff changeset
1000 def clean_destination(self, dest_name):
3a1f49be7667 Clean the target directory if an extraction attempt failed.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 76
diff changeset
1001 try:
3a1f49be7667 Clean the target directory if an extraction attempt failed.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 76
diff changeset
1002 os.unlink(dest_name)
3a1f49be7667 Clean the target directory if an extraction attempt failed.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 76
diff changeset
1003 except OSError, error:
3a1f49be7667 Clean the target directory if an extraction attempt failed.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 76
diff changeset
1004 if error.errno == errno.EISDIR:
3a1f49be7667 Clean the target directory if an extraction attempt failed.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 76
diff changeset
1005 shutil.rmtree(dest_name, ignore_errors=True)
3a1f49be7667 Clean the target directory if an extraction attempt failed.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 76
diff changeset
1006
48
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
1007 def abort(self, signal_num, frame):
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
1008 signal.signal(signal_num, signal.SIG_IGN)
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
1009 print
49
c76dd2716113 [svn] Add a traceback when we catch a signal.
brett
parents: 48
diff changeset
1010 logger.debug("traceback:\n" +
c76dd2716113 [svn] Add a traceback when we catch a signal.
brett
parents: 48
diff changeset
1011 ''.join(traceback.format_stack(frame)).rstrip())
48
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
1012 logger.debug("got signal %s; cleaning up" % (signal_num,))
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
1013 clean_targets = set([os.path.realpath('.')])
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
1014 if hasattr(self, 'current_directory'):
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
1015 clean_targets.add(os.path.realpath(self.current_directory))
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
1016 for directory in clean_targets:
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
1017 os.chdir(directory)
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
1018 for path in glob.glob('.dtrx-*'):
77
3a1f49be7667 Clean the target directory if an extraction attempt failed.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 76
diff changeset
1019 self.clean_destination(path)
48
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
1020 sys.exit(1)
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
1021
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
1022 def parse_options(self, arguments):
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
1023 parser = optparse.OptionParser(
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
1024 usage="%prog [options] archive [archive2 ...]",
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
1025 description="Intelligent archive extractor",
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
1026 version=VERSION_BANNER
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
1027 )
19
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
1028 parser.add_option('-l', '-t', '--list', '--table', dest='show_list',
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
1029 action='store_true', default=False,
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
1030 help="list contents of archives on standard output")
84
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
1031 parser.add_option('-m', '--metadata', dest='metadata',
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
1032 action='store_true', default=False,
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
1033 help="extract metadata from a .deb/.gem")
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
1034 parser.add_option('-r', '--recursive', dest='recursive',
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
1035 action='store_true', default=False,
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
1036 help="extract archives contained in the ones listed")
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
1037 parser.add_option('--one', '--one-entry', dest='one_entry_default',
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
1038 default=None,
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
1039 help=("specify extraction policy for one-entry " +
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
1040 "archives: inside/rename/here"))
20
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
1041 parser.add_option('-n', '--noninteractive', dest='batch',
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
1042 action='store_true', default=False,
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
1043 help="don't ask how to handle special cases")
84
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
1044 parser.add_option('-o', '--overwrite', dest='overwrite',
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
1045 action='store_true', default=False,
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
1046 help="overwrite any existing target output")
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
1047 parser.add_option('-f', '--flat', '--no-directory', dest='flat',
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
1048 action='store_true', default=False,
84
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
1049 help="extract everything to the current directory")
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
1050 parser.add_option('-v', '--verbose', dest='verbose',
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
1051 action='count', default=0,
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
1052 help="be verbose/print debugging information")
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
1053 parser.add_option('-q', '--quiet', dest='quiet',
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
1054 action='count', default=3,
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
1055 help="suppress warning/error messages")
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
1056 self.options, filenames = parser.parse_args(arguments)
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
1057 if not filenames:
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
1058 parser.error("you did not list any archives")
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
1059 # This makes WARNING is the default.
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
1060 self.options.log_level = (10 * (self.options.quiet -
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
1061 self.options.verbose))
84
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
1062 try:
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
1063 self.options.one_entry_policy = OneEntryPolicy(self.options)
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
1064 except ValueError:
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
1065 parser.error("invalid value for --one-entry option")
25
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
1066 self.options.recursion_policy = RecursionPolicy(self.options)
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
1067 self.archives = {os.path.realpath(os.curdir): filenames}
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
1068
12
5d202467c589 [svn] Introduce a real logging system. Right now all this really gets us is the
brett
parents: 11
diff changeset
1069 def setup_logger(self):
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
1070 logging.getLogger().setLevel(self.options.log_level)
12
5d202467c589 [svn] Introduce a real logging system. Right now all this really gets us is the
brett
parents: 11
diff changeset
1071 handler = logging.StreamHandler()
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
1072 handler.setLevel(self.options.log_level)
19
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
1073 formatter = logging.Formatter("dtrx: %(levelname)s: %(message)s")
12
5d202467c589 [svn] Introduce a real logging system. Right now all this really gets us is the
brett
parents: 11
diff changeset
1074 handler.setFormatter(formatter)
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
1075 logger.addHandler(handler)
33
3547e3124729 [svn] Fix some bugs and make things a little more user-friendly now that we can
brett
parents: 32
diff changeset
1076 logger.debug("logger is set up")
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
1077
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
1078 def recurse(self, filename, extractor, action):
53
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
1079 self.options.recursion_policy.prep(filename, action.target, extractor)
25
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
1080 if self.options.recursion_policy.ok_to_recurse():
53
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
1081 for filename in extractor.included_archives:
71
b0290eeb3b7a Recurse better when the contents were just one file.
Brett Smith <brett@brettcsmith.org>
parents: 70
diff changeset
1082 logger.debug("recursing with %s archive" %
b0290eeb3b7a Recurse better when the contents were just one file.
Brett Smith <brett@brettcsmith.org>
parents: 70
diff changeset
1083 (extractor.content_type,))
25
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
1084 tail_path, basename = os.path.split(filename)
71
b0290eeb3b7a Recurse better when the contents were just one file.
Brett Smith <brett@brettcsmith.org>
parents: 70
diff changeset
1085 path_args = [self.current_directory, extractor.included_root,
b0290eeb3b7a Recurse better when the contents were just one file.
Brett Smith <brett@brettcsmith.org>
parents: 70
diff changeset
1086 tail_path]
b0290eeb3b7a Recurse better when the contents were just one file.
Brett Smith <brett@brettcsmith.org>
parents: 70
diff changeset
1087 logger.debug("included root: %s" % (extractor.included_root,))
b0290eeb3b7a Recurse better when the contents were just one file.
Brett Smith <brett@brettcsmith.org>
parents: 70
diff changeset
1088 logger.debug("tail path: %s" % (tail_path,))
b0290eeb3b7a Recurse better when the contents were just one file.
Brett Smith <brett@brettcsmith.org>
parents: 70
diff changeset
1089 if os.path.isdir(action.target):
b0290eeb3b7a Recurse better when the contents were just one file.
Brett Smith <brett@brettcsmith.org>
parents: 70
diff changeset
1090 logger.debug("action target: %s" % (action.target,))
b0290eeb3b7a Recurse better when the contents were just one file.
Brett Smith <brett@brettcsmith.org>
parents: 70
diff changeset
1091 path_args.insert(1, action.target)
b0290eeb3b7a Recurse better when the contents were just one file.
Brett Smith <brett@brettcsmith.org>
parents: 70
diff changeset
1092 directory = os.path.join(*path_args)
25
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
1093 self.archives.setdefault(directory, []).append(basename)
8
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
1094
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1095 def check_file(self, filename):
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1096 try:
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1097 result = os.stat(filename)
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1098 except OSError, error:
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1099 return error.strerror
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1100 if stat.S_ISDIR(result.st_mode):
79
9c0cc7aef510 Improve dtrx -l performance on misnamed files, and clean other error messages.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 78
diff changeset
1101 return "cannot work with a directory"
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1102
78
978307ec7d11 Don't show errors from failed extractors unless they all fail.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 77
diff changeset
1103 def show_stderr(self, logger_func, stderr):
978307ec7d11 Don't show errors from failed extractors unless they all fail.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 77
diff changeset
1104 if stderr:
79
9c0cc7aef510 Improve dtrx -l performance on misnamed files, and clean other error messages.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 78
diff changeset
1105 logger_func("Error output from this process:\n" +
78
978307ec7d11 Don't show errors from failed extractors unless they all fail.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 77
diff changeset
1106 stderr.rstrip('\n'))
978307ec7d11 Don't show errors from failed extractors unless they all fail.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 77
diff changeset
1107
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1108 def try_extractors(self, filename, builder):
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1109 errors = []
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1110 for extractor in builder:
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1111 error = self.action.run(filename, extractor)
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1112 if error:
78
978307ec7d11 Don't show errors from failed extractors unless they all fail.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 77
diff changeset
1113 errors.append((extractor.file_type, extractor.encoding, error,
978307ec7d11 Don't show errors from failed extractors unless they all fail.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 77
diff changeset
1114 extractor.get_stderr()))
77
3a1f49be7667 Clean the target directory if an extraction attempt failed.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 76
diff changeset
1115 if extractor.target is not None:
3a1f49be7667 Clean the target directory if an extraction attempt failed.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 76
diff changeset
1116 self.clean_destination(extractor.target)
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1117 else:
78
978307ec7d11 Don't show errors from failed extractors unless they all fail.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 77
diff changeset
1118 self.show_stderr(logger.warn, extractor.get_stderr())
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1119 self.recurse(filename, extractor, self.action)
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1120 return
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1121 logger.error("could not handle %s" % (filename,))
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1122 if not errors:
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1123 logger.error("not a known archive type")
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1124 return True
78
978307ec7d11 Don't show errors from failed extractors unless they all fail.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 77
diff changeset
1125 for file_type, encoding, error, stderr in errors:
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1126 message = ["treating as", file_type, "failed:", error]
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1127 if encoding:
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1128 message.insert(1, "%s-encoded" % (encoding,))
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1129 logger.error(' '.join(message))
78
978307ec7d11 Don't show errors from failed extractors unless they all fail.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 77
diff changeset
1130 self.show_stderr(logger.error, stderr)
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1131 return True
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1132
19
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
1133 def run(self):
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
1134 if self.options.show_list:
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
1135 action = ListAction
19
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
1136 else:
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
1137 action = ExtractionAction
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1138 self.action = action(self.options, self.archives.values()[0])
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
1139 while self.archives:
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
1140 self.current_directory, self.filenames = self.archives.popitem()
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
1141 os.chdir(self.current_directory)
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
1142 for filename in self.filenames:
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
1143 builder = ExtractorBuilder(filename, self.options)
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1144 error = (self.check_file(filename) or
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1145 self.try_extractors(filename, builder.get_extractor()))
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1146 if error:
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1147 if error != True:
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1148 logger.error("%s: %s" % (filename, error))
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1149 self.failures.append(filename)
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
1150 else:
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1151 self.successes.append(filename)
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
1152 self.options.one_entry_policy.permanent_policy = EXTRACT_WRAP
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
1153 if self.failures:
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
1154 return 1
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
1155 return 0
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
1156
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
1157
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
1158 if __name__ == '__main__':
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
1159 app = ExtractorApplication(sys.argv[1:])
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
1160 sys.exit(app.run())

mercurial