scripts/dtrx

Thu, 10 Jul 2008 19:51:53 -0400

author: Brett Smith <brett@brettcsmith.org>
date: Thu, 10 Jul 2008 19:51:53 -0400
branch: trunk
changeset: 69:35a2f45cdd3b
parent: 66:af0b822b012e
child: 70:48d2421a3178
permissions: -rwxr-xr-x

Count files in the archive and report that in the recursion prompt.

I like this because, for example, if you see that all or most of the files
in the archive are themselves archives, you can decide to recurse right
away.

This involved making the grep tests a little smarter about handling
whitespace.
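
As a rough illustration of the counting described above, the sketch below walks an extracted tree, counts every file, and collects the ones that look like archives. The names `ARCHIVE_SUFFIXES`, `count_files_and_archives`, and `recursion_prompt` are hypothetical and exist only for this example, as is the prompt wording. The script itself detects archives through `ExtractorBuilder.try_by_mimetype` and `ExtractorBuilder.try_by_extension` rather than a fixed suffix list, and it targets Python 2, whereas this sketch assumes Python 3.

```python
import os

# Hypothetical suffix list, used only in this sketch. The real script asks
# ExtractorBuilder whether a file is an archive instead.
ARCHIVE_SUFFIXES = ('.tar', '.tar.gz', '.tgz', '.tar.bz2', '.zip')


def count_files_and_archives(root):
    """Walk root; return (total file count, sorted relative archive paths)."""
    file_count = 0
    archives = []
    for path, _dirnames, filenames in os.walk(root):
        # Every regular entry counts toward the total, archive or not.
        file_count += len(filenames)
        for filename in filenames:
            if filename.endswith(ARCHIVE_SUFFIXES):
                full_path = os.path.join(path, filename)
                archives.append(os.path.relpath(full_path, root))
    return file_count, sorted(archives)


def recursion_prompt(file_count, archives):
    """Build a one-line summary to show before asking whether to recurse."""
    return "%s archive(s) found among %s file(s)." % (len(archives), file_count)
```

Reporting both numbers lets the user judge at a glance whether the extracted tree is mostly nested archives, which is the case the commit message has in mind.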

#!/usr/bin/env python
#
# dtrx -- Intelligently extract various archive types.
# Copyright (c) 2006, 2007, 2008 Brett Smith <brettcsmith@brettcsmith.org>.
#
# This program is free software; you can redistribute it and/or modify it
# under the terms of the GNU General Public License as published by the
# Free Software Foundation; either version 3 of the License, or (at your
# option) any later version.
#
# This program is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General
# Public License for more details.
#
# You should have received a copy of the GNU General Public License along
# with this program; if not, see <http://www.gnu.org/licenses/>.

import errno
import glob
import logging
import mimetypes
import optparse
import os
import re
import shutil
import signal
import stat
import subprocess
import sys
import tempfile
import textwrap
import traceback

from sets import Set as set

VERSION = "6.0"
VERSION_BANNER = """dtrx version %s
Copyright (c) 2006, 2007, 2008 Brett Smith <brettcsmith@brettcsmith.org>

This program is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the
Free Software Foundation; either version 3 of the License, or (at your
option) any later version.

This program is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General
Public License for more details.""" % (VERSION,)

MATCHING_DIRECTORY = 1
ONE_ENTRY_KNOWN = 2
BOMB = 3
EMPTY = 4
ONE_ENTRY_FILE = 'file'
ONE_ENTRY_DIRECTORY = 'directory'

ONE_ENTRY_UNKNOWN = [ONE_ENTRY_FILE, ONE_ENTRY_DIRECTORY]

EXTRACT_HERE = 1
EXTRACT_WRAP = 2
EXTRACT_RENAME = 3

RECURSE_ALWAYS = 1
RECURSE_ONCE = 2
RECURSE_NOT_NOW = 3
RECURSE_NEVER = 4
RECURSE_LIST = 5

mimetypes.encodings_map.setdefault('.bz2', 'bzip2')
mimetypes.encodings_map.setdefault('.lzma', 'lzma')
mimetypes.types_map.setdefault('.gem', 'application/x-ruby-gem')

logger = logging.getLogger('dtrx-log')

class FilenameChecker(object):
    free_func = os.open
    free_args = (os.O_CREAT | os.O_EXCL,)
    free_close = os.close

    def __init__(self, original_name):
        self.original_name = original_name

    def is_free(self, filename):
        try:
            result = self.free_func(filename, *self.free_args)
        except OSError, error:
            if error.errno == errno.EEXIST:
                return False
            raise
        if self.free_close:
            self.free_close(result)
        return True

    def create(self):
        fd, filename = tempfile.mkstemp(prefix=self.original_name + '.',
                                        dir='.')
        os.close(fd)
        return filename

    def check(self):
        for suffix in [''] + ['.%s' % (x,) for x in range(1, 10)]:
            filename = '%s%s' % (self.original_name, suffix)
            if self.is_free(filename):
                return filename
        return self.create()


class DirectoryChecker(FilenameChecker):
    free_func = os.mkdir
    free_args = ()
    free_close = None

    def create(self):
        return tempfile.mkdtemp(prefix=self.original_name + '.', dir='.')


class ExtractorError(Exception):
    pass


class ExtractorUnusable(Exception):
    pass


EXTRACTION_ERRORS = (ExtractorError, ExtractorUnusable, OSError, IOError)

class BaseExtractor(object):
    decoders = {'bzip2': 'bzcat', 'gzip': 'zcat', 'compress': 'zcat',
                'lzma': 'lzcat'}

    name_checker = DirectoryChecker

    def __init__(self, filename, encoding):
        if encoding and (not self.decoders.has_key(encoding)):
            raise ValueError("unrecognized encoding %s" % (encoding,))
        self.filename = os.path.realpath(filename)
        self.encoding = encoding
        self.file_count = 0
        self.included_archives = []
        self.target = None
        self.content_type = None
        self.content_name = None
        self.pipes = []
        self.stderr = tempfile.TemporaryFile()
        self.exit_codes = []
        try:
            self.archive = open(filename, 'r')
        except (IOError, OSError), error:
            raise ExtractorError("could not open %s: %s" %
                                 (filename, error.strerror))
        if encoding:
            self.pipe([self.decoders[encoding]], "decoding")
        self.prepare()

    def pipe(self, command, description="extraction"):
        self.pipes.append((command, description))

    def first_bad_exit_code(self):
        for index, code in enumerate(self.exit_codes):
            if code != 0:
                return index
        return None

    def run_pipes(self, final_stdout=None):
        if not self.pipes:
            return
        elif final_stdout is None:
            # FIXME: Buffering this might be dumb.
            final_stdout = tempfile.TemporaryFile()
        num_pipes = len(self.pipes)
        last_pipe = num_pipes - 1
        processes = []
        for index, command in enumerate([pipe[0] for pipe in self.pipes]):
            if index == 0:
                stdin = self.archive
            else:
                stdin = processes[-1].stdout
            if index == last_pipe:
                stdout = final_stdout
            else:
                stdout = subprocess.PIPE
            try:
                processes.append(subprocess.Popen(command, stdin=stdin,
                                                  stdout=stdout,
                                                  stderr=self.stderr))
            except OSError, error:
                if error.errno == errno.ENOENT:
                    raise ExtractorUnusable("could not run %s" % (command[0],))
                raise
        self.exit_codes = [pipe.wait() for pipe in processes]
        self.archive.close()
        for index in range(last_pipe):
            processes[index].stdout.close()
        self.archive = final_stdout

    def prepare(self):
        pass

47
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
200 def check_included_archives(self):
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
201 if (self.content_name is None) or (not self.content_name.endswith('/')):
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
202 self.included_root = './'
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
203 else:
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
204 self.included_root = self.content_name
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
205 start_index = len(self.included_root)
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
206 for path, dirname, filenames in os.walk(self.included_root):
69
35a2f45cdd3b Count files in the archive and report that in the recursion prompt.
Brett Smith <brett@brettcsmith.org>
parents: 66
diff changeset
207 self.file_count += len(filenames)
47
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
208 path = path[start_index:]
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
209 for filename in filenames:
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
210 if (ExtractorBuilder.try_by_mimetype(filename) or
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
211 ExtractorBuilder.try_by_extension(filename)):
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
212 self.included_archives.append(os.path.join(path, filename))
22
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
213
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
214 def check_contents(self):
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
215 self.contents = os.listdir('.')
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
216 if not self.contents:
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
217 self.content_type = EMPTY
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
218 elif len(self.contents) == 1:
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
219 if self.basename() == self.contents[0]:
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
220 self.content_type = MATCHING_DIRECTORY
65
0aea49161478 Make the wording on the One Entry question a little clearer.
Brett Smith <brett@brettcsmith.org>
parents: 62
diff changeset
221 elif os.path.isdir(self.contents[0]):
0aea49161478 Make the wording on the One Entry question a little clearer.
Brett Smith <brett@brettcsmith.org>
parents: 62
diff changeset
222 self.content_type = ONE_ENTRY_DIRECTORY
22
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
223 else:
65
0aea49161478 Make the wording on the One Entry question a little clearer.
Brett Smith <brett@brettcsmith.org>
parents: 62
diff changeset
224 self.content_type = ONE_ENTRY_FILE
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
225 self.content_name = self.contents[0]
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
226 if os.path.isdir(self.contents[0]):
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
227 self.content_name += '/'
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
228 else:
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
229 self.content_type = BOMB
47
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
230 self.check_included_archives()
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
231
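The decision tree in check_contents above can be sketched as a standalone function. This is only an illustration in Python 3 (the listing itself is Python 2): the function name classify_contents, the string values given to the constants, and returning None as the name for the EMPTY and BOMB cases are all assumptions made for the example, not part of dtrx.

```python
import os

# Hypothetical stand-ins: the listing uses these names, but their real
# values are defined elsewhere in dtrx.
EMPTY = "EMPTY"
MATCHING_DIRECTORY = "MATCHING_DIRECTORY"
ONE_ENTRY_DIRECTORY = "ONE_ENTRY_DIRECTORY"
ONE_ENTRY_FILE = "ONE_ENTRY_FILE"
BOMB = "BOMB"


def classify_contents(directory, archive_basename):
    """Return (content_type, content_name) for an extraction directory."""
    contents = os.listdir(directory)
    if not contents:
        return EMPTY, None
    if len(contents) > 1:
        # More than one top-level entry: the archive is a "bomb".
        return BOMB, None
    entry = contents[0]
    is_dir = os.path.isdir(os.path.join(directory, entry))
    if entry == archive_basename:
        content_type = MATCHING_DIRECTORY
    elif is_dir:
        content_type = ONE_ENTRY_DIRECTORY
    else:
        content_type = ONE_ENTRY_FILE
    # Like the listing, mark a directory with a trailing slash.
    content_name = entry + "/" if is_dir else entry
    return content_type, content_name
```

Unlike the method, the sketch takes the directory as a parameter instead of relying on the current working directory, which makes it easier to try in isolation.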
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
232 def basename(self):
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
233 pieces = os.path.basename(self.filename).split('.')
2
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
234 extension = '.' + pieces[-1]
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
235 if mimetypes.encodings_map.has_key(extension):
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
236 pieces.pop()
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
237 extension = '.' + pieces[-1]
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
238 if (mimetypes.types_map.has_key(extension) or
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
239 mimetypes.common_types.has_key(extension) or
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
240 mimetypes.suffix_map.has_key(extension)):
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
241 pieces.pop()
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
242 return '.'.join(pieces)
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
243
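As a rough illustration of the basename method above, the following Python 3 sketch applies the same two-step stripping: first an optional compression suffix found in mimetypes.encodings_map, then one archive suffix found in types_map, common_types, or suffix_map. The function name strip_archive_extensions is hypothetical, and `in` replaces the Python 2 has_key calls. It mirrors the listing closely, including its lack of guards for unusual names.

```python
import mimetypes
import os


def strip_archive_extensions(filename):
    """Drop a compression suffix, then one known type suffix, if present."""
    pieces = os.path.basename(filename).split(".")
    extension = "." + pieces[-1]
    if extension in mimetypes.encodings_map:
        # e.g. ".gz" or ".bz2": remove it and look at the next suffix.
        pieces.pop()
        extension = "." + pieces[-1]
    if (extension in mimetypes.types_map
            or extension in mimetypes.common_types
            or extension in mimetypes.suffix_map):
        # e.g. ".tar" (a known type) or ".tgz" (a known suffix alias).
        pieces.pop()
    return ".".join(pieces)
```

With the standard library's default tables, a name such as foo.tar.gz loses both suffixes, while a name with no dot is returned unchanged.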
62
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
diff changeset
244 def check_success(self, got_output):
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
diff changeset
245 self.stderr.seek(0, 0)
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
diff changeset
246 if self.stderr.read(1):
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
diff changeset
247 self.stderr.seek(0, 0)
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
diff changeset
248 logger.warning(self.stderr.read(-1))
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
diff changeset
249 self.stderr.close()
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
diff changeset
250 error_index = self.first_bad_exit_code()
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
diff changeset
251 if (not got_output) and (error_index is not None):
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
diff changeset
252 command = ' '.join(self.pipes[error_index][0])
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
diff changeset
253 raise ExtractorError("%s error: '%s' returned status code %s" %
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
diff changeset
254 (self.pipes[error_index][1], command,
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
diff changeset
255 self.exit_codes[error_index]))
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
diff changeset
256
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
257 def extract(self):
40
ee6a869f8da1 [svn] Be a little nicer about explaining that we can't extract to the current
brett
parents: 39
diff changeset
258 try:
ee6a869f8da1 [svn] Be a little nicer about explaining that we can't extract to the current
brett
parents: 39
diff changeset
259 self.target = tempfile.mkdtemp(prefix='.dtrx-', dir='.')
ee6a869f8da1 [svn] Be a little nicer about explaining that we can't extract to the current
brett
parents: 39
diff changeset
260 except (OSError, IOError), error:
ee6a869f8da1 [svn] Be a little nicer about explaining that we can't extract to the current
brett
parents: 39
diff changeset
261 raise ExtractorError("cannot extract here: %s" % (error.strerror,))
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
262 old_path = os.path.realpath(os.curdir)
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
263 os.chdir(self.target)
33
3547e3124729 [svn] Fix some bugs and make things a little more user-friendly now that we can
brett
parents: 32
diff changeset
264 try:
3547e3124729 [svn] Fix some bugs and make things a little more user-friendly now that we can
brett
parents: 32
diff changeset
265 self.archive.seek(0, 0)
3547e3124729 [svn] Fix some bugs and make things a little more user-friendly now that we can
brett
parents: 32
diff changeset
266 self.extract_archive()
3547e3124729 [svn] Fix some bugs and make things a little more user-friendly now that we can
brett
parents: 32
diff changeset
267 self.check_contents()
62
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
diff changeset
268 self.check_success(self.content_type != EMPTY)
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
269 except EXTRACTION_ERRORS:
48
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
270 self.archive.close()
33
3547e3124729 [svn] Fix some bugs and make things a little more user-friendly now that we can
brett
parents: 32
diff changeset
271 os.chdir(old_path)
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
272 shutil.rmtree(self.target, ignore_errors=True)
33
3547e3124729 [svn] Fix some bugs and make things a little more user-friendly now that we can
brett
parents: 32
diff changeset
273 raise
48
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
274 self.archive.close()
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
275 os.chdir(old_path)
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
276
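The extract method above follows a common pattern: create a hidden temporary directory, change into it, do the work, and on failure restore the working directory and remove the partial output before re-raising. A minimal Python 3 sketch of that pattern follows. The function name extract_into_temp_dir, the do_extract callback, and the parent parameter are assumptions for the example, and it catches Exception where the listing catches EXTRACTION_ERRORS, which is defined elsewhere in dtrx.

```python
import os
import shutil
import tempfile


def extract_into_temp_dir(do_extract, parent="."):
    """Run do_extract() inside a fresh '.dtrx-' directory under parent."""
    target = tempfile.mkdtemp(prefix=".dtrx-", dir=parent)
    old_path = os.path.realpath(os.curdir)
    os.chdir(target)
    try:
        do_extract()
    except Exception:
        # Restore the working directory first, then discard partial output.
        os.chdir(old_path)
        shutil.rmtree(target, ignore_errors=True)
        raise
    os.chdir(old_path)
    return target
```

Changing back to the old directory before calling shutil.rmtree matters because the target path may be relative to that directory.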
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
277 def get_filenames(self):
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
278 self.run_pipes()
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
279 self.archive.seek(0, 0)
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
280 while True:
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
281 line = self.archive.readline()
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
282 if not line:
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
283 self.archive.close()
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
284 return
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
285 yield line.rstrip('\n')
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
286
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
287
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
288 class CompressionExtractor(BaseExtractor):
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
289 file_type = 'compressed file'
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
290 name_checker = FilenameChecker
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
291
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
292 def basename(self):
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
293 pieces = os.path.basename(self.filename).split('.')
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
294 extension = '.' + pieces[-1]
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
295 if mimetypes.encodings_map.has_key(extension):
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
296 pieces.pop()
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
297 return '.'.join(pieces)
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
298
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
299 def get_filenames(self):
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
300 yield self.basename()
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
301
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
302 def extract(self):
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
303 self.content_type = ONE_ENTRY_KNOWN
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
304 self.content_name = self.basename()
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
305 self.contents = None
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
306 self.included_root = './'
40
ee6a869f8da1 [svn] Be a little nicer about explaining that we can't extract to the current
brett
parents: 39
diff changeset
307 try:
ee6a869f8da1 [svn] Be a little nicer about explaining that we can't extract to the current
brett
parents: 39
diff changeset
308 output_fd, self.target = tempfile.mkstemp(prefix='.dtrx-', dir='.')
ee6a869f8da1 [svn] Be a little nicer about explaining that we can't extract to the current
brett
parents: 39
diff changeset
309 except (OSError, IOError), error:
ee6a869f8da1 [svn] Be a little nicer about explaining that we can't extract to the current
brett
parents: 39
diff changeset
310 raise ExtractorError("cannot extract here: %s" % (error.strerror,))
62
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
diff changeset
311 self.run_pipes(output_fd)
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
diff changeset
312 os.close(output_fd)
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
313 try:
62
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
diff changeset
314 self.check_success(os.stat(self.target)[stat.ST_SIZE] > 0)
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
315 except EXTRACTION_ERRORS:
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
316 os.unlink(self.target)
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
317 raise
62
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
diff changeset
318
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
319 class TarExtractor(BaseExtractor):
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
320 file_type = 'tar file'
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
321
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
322 def get_filenames(self):
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
323 self.pipe(['tar', '-t'], "listing")
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
324 return BaseExtractor.get_filenames(self)
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
325
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
326 def extract_archive(self):
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
327 self.pipe(['tar', '-x'])
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
328 self.run_pipes()
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
329
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
330
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
331 class CpioExtractor(BaseExtractor):
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
332 file_type = 'cpio file'
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
333
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
334 def get_filenames(self):
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
335 self.pipe(['cpio', '-t'], "listing")
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
336 return BaseExtractor.get_filenames(self)
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
337
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
338 def extract_archive(self):
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
339 self.pipe(['cpio', '-i', '--make-directories',
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
340 '--no-absolute-filenames'])
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
341 self.run_pipes()
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
342
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
343
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
344 class RPMExtractor(CpioExtractor):
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
345 file_type = 'RPM'
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
346
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
347 def prepare(self):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
348 self.pipe(['rpm2cpio', '-'], "rpm2cpio")
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
349
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
350 def basename(self):
9
920417b8acc9 [svn] Fix issues with basename methods. First, string's rsplit method only
brett
parents: 8
diff changeset
351 pieces = os.path.basename(self.filename).split('.')
2
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
352 if len(pieces) == 1:
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
353 return pieces[0]
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
354 elif pieces[-1] != 'rpm':
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
355 return BaseExtractor.basename(self)
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
356 pieces.pop()
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
357 if len(pieces) == 1:
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
358 return pieces[0]
9
920417b8acc9 [svn] Fix issues with basename methods. First, string's rsplit method only
brett
parents: 8
diff changeset
359 elif len(pieces[-1]) < 8:
2
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
360 pieces.pop()
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
361 return '.'.join(pieces)
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
362
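To make the RPM basename heuristic above concrete, here is a small Python 3 sketch. The function name rpm_basename and the fallback parameter (standing in for the call to BaseExtractor.basename) are assumptions for the example. The length test on the last remaining piece appears intended to drop a short architecture field such as i386 or x86_64, though the listing does not say so.

```python
import os


def rpm_basename(filename, fallback=lambda name: name):
    """Strip '.rpm' and a short trailing field from an RPM file name."""
    pieces = os.path.basename(filename).split(".")
    if len(pieces) == 1:
        return pieces[0]
    if pieces[-1] != "rpm":
        # Not named like an RPM: defer to the generic logic.
        return fallback(filename)
    pieces.pop()
    if len(pieces) == 1:
        return pieces[0]
    if len(pieces[-1]) < 8:
        # Short last field, presumably the architecture.
        pieces.pop()
    return ".".join(pieces)
```

For example, hello-2.10-1.x86_64.rpm is split on dots, loses rpm and then x86_64, and becomes hello-2.10-1.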
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
363 def check_contents(self):
47
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
364 self.check_included_archives()
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
365 self.content_type = BOMB
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
366
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
367
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
368 class DebExtractor(TarExtractor):
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
369 file_type = 'Debian package'
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
370
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
371 def prepare(self):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
372 self.pipe(['ar', 'p', self.filename, 'data.tar.gz'],
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
373 "data.tar.gz extraction")
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
374 self.pipe(['zcat'], "data.tar.gz decompression")
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
375
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
376 def basename(self):
9
920417b8acc9 [svn] Fix issues with basename methods. First, string's rsplit method only
brett
parents: 8
diff changeset
377 pieces = os.path.basename(self.filename).split('_')
2
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
378 if len(pieces) == 1:
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
379 return pieces[0]
9
920417b8acc9 [svn] Fix issues with basename methods. First, string's rsplit method only
brett
parents: 8
diff changeset
380 last_piece = pieces.pop()
920417b8acc9 [svn] Fix issues with basename methods. First, string's rsplit method only
brett
parents: 8
diff changeset
381 if (len(last_piece) > 10) or (not last_piece.endswith('.deb')):
2
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
382 return BaseExtractor.basename(self)
9
920417b8acc9 [svn] Fix issues with basename methods. First, string's rsplit method only
brett
parents: 8
diff changeset
383 return '_'.join(pieces)
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
384
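The Debian variant above splits on underscores instead of dots, matching the usual name_version_arch.deb layout. A Python 3 sketch under the same assumptions as before (the name deb_basename and the fallback parameter are hypothetical; the fallback stands in for BaseExtractor.basename):

```python
import os


def deb_basename(filename, fallback=lambda name: name):
    """Strip a trailing '_<arch>.deb' field from a Debian package name."""
    pieces = os.path.basename(filename).split("_")
    if len(pieces) == 1:
        # No underscore at all: the name is returned as-is.
        return pieces[0]
    last_piece = pieces.pop()
    if len(last_piece) > 10 or not last_piece.endswith(".deb"):
        # Last field does not look like '<arch>.deb': use generic logic.
        return fallback(filename)
    return "_".join(pieces)
```

Note that a name without underscores, such as hello.deb, comes back with its extension intact, which is also what the listing does.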
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
385 def check_contents(self):
47
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
386 self.check_included_archives()
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
387 self.content_type = BOMB
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
388
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
389
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
390 class DebMetadataExtractor(DebExtractor):
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
391 def prepare(self):
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
392 self.pipe(['ar', 'p', self.filename, 'control.tar.gz'],
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
393 "control.tar.gz extraction")
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
394 self.pipe(['zcat'], "control.tar.gz decompression")
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
395
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
396
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
397 class GemExtractor(TarExtractor):
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
398 file_type = 'Ruby gem'
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
399
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
400 def prepare(self):
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
401 self.pipe(['tar', '-xO', 'data.tar.gz'], "data.tar.gz extraction")
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
402 self.pipe(['zcat'], "data.tar.gz decompression")
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
403
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
404 def check_contents(self):
47
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
405 self.check_included_archives()
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
406 self.content_type = BOMB
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
407
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
408
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
409 class GemMetadataExtractor(CompressionExtractor):
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
410 file_type = 'Ruby gem'
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
411
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
412 def prepare(self):
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
413 self.pipe(['tar', '-xO', 'metadata.gz'], "metadata.gz extraction")
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
414 self.pipe(['zcat'], "metadata.gz decompression")
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
415
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
416 def basename(self):
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
417 return os.path.basename(self.filename) + '-metadata.txt'
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
418
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
419
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
420 class NoPipeExtractor(BaseExtractor):
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
421 # Some extraction tools won't accept the archive from stdin. With
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
422 # these, the piping infrastructure we normally set up generally doesn't
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
423 # work, at least at first. We can still use most of it; we just don't
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
424 # want to seed self.archive with the archive file, since that sucks up
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
425 # memory. So instead we seed it with /dev/null, and specify the
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
426 # filename on the command line as necessary. We also open the actual
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
427 # file with os.open, to make sure we can actually do it (permissions
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
428 # are good, etc.). This class doesn't do anything by itself; it's just
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
429 # meant to be a base class for extractors that rely on these dumb
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
430 # tools.
32
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
431 def __init__(self, filename, encoding):
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
432 os.close(os.open(filename, os.O_RDONLY))
32
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
433 BaseExtractor.__init__(self, '/dev/null', None)
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
434 self.filename = os.path.realpath(filename)
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
435
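The comment in NoPipeExtractor describes opening the real archive only to confirm it can be read, then handing the tool an absolute path instead of piping data. A minimal Python 3 sketch of that check follows; the function name probe_readable is hypothetical, and it covers only the probe, not the /dev/null seeding done through BaseExtractor.

```python
import os


def probe_readable(filename):
    """Open and immediately close filename, then return its real path.

    os.open raises OSError (for example FileNotFoundError or
    PermissionError) if the file cannot be opened for reading.
    """
    os.close(os.open(filename, os.O_RDONLY))
    return os.path.realpath(filename)
```

Failing here, before any external tool runs, means the error comes from Python with a clear cause instead of from the extraction tool's own output.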
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
436
class ZipExtractor(NoPipeExtractor):
    file_type = 'Zip file'

    def get_filenames(self):
        self.pipe(['zipinfo', '-1', self.filename], "listing")
        return BaseExtractor.get_filenames(self)

    def extract_archive(self):
        self.pipe(['unzip', '-q', self.filename])
        self.run_pipes()


class SevenExtractor(NoPipeExtractor):
    file_type = '7z file'
    border_re = re.compile('^[- ]+$')

    def get_filenames(self):
        self.pipe(['7z', 'l', self.filename], "listing")
        self.run_pipes()
        self.archive.seek(0, 0)
        fn_index = None
        for line in self.archive:
            if self.border_re.match(line):
                if fn_index is not None:
                    break
                else:
                    fn_index = line.rindex(' ') + 1
            elif fn_index is not None:
                yield line[fn_index:-1]
        self.archive.close()

    def extract_archive(self):
        self.pipe(['7z', 'x', self.filename])
        self.run_pipes()


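The border-driven parsing in SevenExtractor.get_filenames can be tried in isolation. The following is a minimal sketch, not part of dtrx: the function name listing_filenames, the ROW layout, and the sample rows are all hypothetical, and the sample is only shaped like a bordered table rather than copied from real 7z output. It assumes, as the method above does, that the name column begins right after the last space in the first border line.

```python
import re

BORDER_RE = re.compile(r'^[- ]+$')


def listing_filenames(lines):
    """Yield the name column of a bordered, column-aligned listing."""
    fn_index = None
    for line in lines:
        if BORDER_RE.match(line):
            if fn_index is not None:
                break  # second border: the file table is over
            # First border: names start just after its last space.
            fn_index = line.rindex(' ') + 1
        elif fn_index is not None:
            yield line[fn_index:-1]  # drop the trailing newline


# Hypothetical table layout, built with one format string so that
# every row shares the same column positions.
ROW = "%-19s %-5s %12s %12s  %s\n"
BORDER = ROW % ("-" * 19, "-" * 5, "-" * 12, "-" * 12, "-" * 12)
SAMPLE = [
    ROW % ("Date Time", "Attr", "Size", "Compressed", "Name"),
    BORDER,
    ROW % ("2008-07-10 19:51:53", "....A", 120, 80, "docs/read me.txt"),
    ROW % ("2008-07-10 19:51:53", "....A", 42, 30, "setup.py"),
    BORDER,
    ROW % ("", "", 162, 110, "2 files"),
]

names = list(listing_filenames(SAMPLE))
```

Slicing by column rather than splitting on whitespace keeps names that contain spaces intact, which is presumably why the method records fn_index from the border line.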
class CABExtractor(NoPipeExtractor):
    file_type = 'CAB archive'
    border_re = re.compile(r'^[-\+]+$')

    def get_filenames(self):
        self.pipe(['cabextract', '-l', self.filename], "listing")
        self.run_pipes()
        self.archive.seek(0, 0)
        fn_index = None
        for line in self.archive:
            if self.border_re.match(line):
                break
        for line in self.archive:
            try:
                yield line.split(' | ', 2)[2].rstrip('\n')
            except IndexError:
                break
        self.archive.close()

    def extract_archive(self):
        self.pipe(['cabextract', '-q', self.filename])
        self.run_pipes()


class BaseHandler(object):
    def __init__(self, extractor, options):
        self.extractor = extractor
        self.options = options
        self.target = None

    def handle(self):
        command = 'find'
        status = subprocess.call(['find', self.extractor.target, '-type', 'd',
                                  '-exec', 'chmod', 'u+rwx', '{}', ';'])
        if status == 0:
            command = 'chmod'
            status = subprocess.call(['chmod', '-R', 'u+rwX',
                                      self.extractor.target])
        if status != 0:
            return "%s returned with exit status %s" % (command, status)
        return self.organize()

    def set_target(self, target, checker):
        self.target = checker(target).check()
        if self.target != target:
            logger.warning("extracting %s to %s" %
                           (self.extractor.filename, self.target))


# The "where to extract" table, with options and archive types.
# This dictates the contents of each can_handle method.
#
#         Flat           Overwrite            None
# File    basename       basename             FilenameChecked
# Match   .              .                    tempdir + checked
# Bomb    .              basename             DirectoryChecked

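The table can be read as a dispatch rule: each handler's can_handle is a predicate over the archive contents and the options, and the catch-all BombHandler only makes sense if it is consulted last. The sketch below is an illustration under assumptions, not code from dtrx: the constant values, the Options tuple, the match_ok flag (standing in for options.one_entry_policy.ok_for_match()), the treatment of ONE_ENTRY_UNKNOWN as a single value, and the order of HANDLERS are all hypothetical.

```python
from collections import namedtuple

# Stand-in values; the real constants are defined elsewhere in dtrx.
MATCHING_DIRECTORY, ONE_ENTRY_KNOWN, ONE_ENTRY_UNKNOWN, BOMB, EMPTY = range(5)

# Hypothetical options record; match_ok stands in for the one-entry policy.
Options = namedtuple('Options', 'flat overwrite match_ok')


def flat_ok(contents, options):
    return bool((options.flat and contents != ONE_ENTRY_KNOWN) or
                (options.overwrite and contents == MATCHING_DIRECTORY))


def overwrite_ok(contents, options):
    return bool((options.flat and contents == ONE_ENTRY_KNOWN) or
                (options.overwrite and contents != MATCHING_DIRECTORY))


def match_ok(contents, options):
    return bool(contents == MATCHING_DIRECTORY or
                (contents == ONE_ENTRY_UNKNOWN and options.match_ok))


def empty_ok(contents, options):
    return contents == EMPTY


def bomb_ok(contents, options):
    return True  # accepts everything, so it has to be tried last


# Assumed order: the first predicate that accepts wins.
HANDLERS = [('Flat', flat_ok), ('Overwrite', overwrite_ok),
            ('Match', match_ok), ('Empty', empty_ok), ('Bomb', bomb_ok)]


def choose_handler(contents, options):
    for name, predicate in HANDLERS:
        if predicate(contents, options):
            return name
```

With no options set, a matching directory goes to the Match row and a bomb goes to the Bomb row; with the flat option, a bomb is flattened into the current directory while a single known file is written to its basename, as in the Flat column.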
class FlatHandler(BaseHandler):
    def can_handle(contents, options):
        return ((options.flat and (contents != ONE_ENTRY_KNOWN)) or
                (options.overwrite and (contents == MATCHING_DIRECTORY)))
    can_handle = staticmethod(can_handle)

    def organize(self):
        self.target = '.'
        for curdir, dirs, filenames in os.walk(self.extractor.target,
                                               topdown=False):
            path_parts = curdir.split(os.sep)
            if path_parts[0] == '.':
                del path_parts[1]
            else:
                del path_parts[0]
            newdir = os.path.join(*path_parts)
            if not os.path.isdir(newdir):
                os.makedirs(newdir)
            for filename in filenames:
                os.rename(os.path.join(curdir, filename),
                          os.path.join(newdir, filename))
            os.rmdir(curdir)


class OverwriteHandler(BaseHandler):
    def can_handle(contents, options):
        return ((options.flat and (contents == ONE_ENTRY_KNOWN)) or
                (options.overwrite and (contents != MATCHING_DIRECTORY)))
    can_handle = staticmethod(can_handle)

    def organize(self):
        self.target = self.extractor.basename()
        if os.path.isdir(self.target):
            shutil.rmtree(self.target)
        os.rename(self.extractor.target, self.target)


class MatchHandler(BaseHandler):
    def can_handle(contents, options):
        return ((contents == MATCHING_DIRECTORY) or
                ((contents in ONE_ENTRY_UNKNOWN) and
                 options.one_entry_policy.ok_for_match()))
    can_handle = staticmethod(can_handle)

    def organize(self):
        source = os.path.join(self.extractor.target,
                              os.listdir(self.extractor.target)[0])
        if os.path.isdir(source):
            checker = DirectoryChecker
        else:
            checker = FilenameChecker
        if self.options.one_entry_policy == EXTRACT_HERE:
            destination = self.extractor.content_name.rstrip('/')
        else:
            destination = self.extractor.basename()
        self.set_target(destination, checker)
        if os.path.isdir(self.extractor.target):
            os.rename(source, self.target)
            os.rmdir(self.extractor.target)
        else:
            os.rename(self.extractor.target, self.target)
        self.extractor.included_root = './'


class EmptyHandler(object):
    def can_handle(contents, options):
        return contents == EMPTY
    can_handle = staticmethod(can_handle)

    def __init__(self, extractor, options): pass
    def handle(self): pass


class BombHandler(BaseHandler):
    def can_handle(contents, options):
        return True
    can_handle = staticmethod(can_handle)

    def organize(self):
        basename = self.extractor.basename()
        self.set_target(basename, self.extractor.name_checker)
        os.rename(self.extractor.target, self.target)


class BasePolicy(object):
    def __init__(self, options):
        self.current_policy = None
        if options.batch:
            self.permanent_policy = self.answers['']
        else:
            self.permanent_policy = None

    def ask_question(self, question):
        question = question + self.choices
        while True:
            print "\n".join(question)
            try:
                answer = raw_input(self.prompt)
            except EOFError:
                return self.answers['']
            try:
                return self.answers[answer.lower()]
            except KeyError:
                print

    def __cmp__(self, other):
        return cmp(self.current_policy, other)


class OneEntryPolicy(BasePolicy):
    answers = {'h': EXTRACT_HERE, 'i': EXTRACT_WRAP, 'r': EXTRACT_RENAME,
               '': EXTRACT_WRAP}
    choices = ["You can:",
               " * extract it Inside another directory",
               " * extract it and Rename the directory",
               " * extract it Here"]
    prompt = "What do you want to do? (I/r/h) "

    def __init__(self, options):
        BasePolicy.__init__(self, options)
        if options.flat:
            self.permanent_policy = EXTRACT_HERE

    def prep(self, archive_filename, extractor):
        question = ["%s contains one %s, but it has a weird name." %
                    (archive_filename, extractor.content_type)]
        question.append(" Expected: " + extractor.basename())
        question.append(" Actual: " + extractor.content_name)
        self.current_policy = (self.permanent_policy or
                               self.ask_question(question))

    def ok_for_match(self):
        return self.current_policy in (EXTRACT_RENAME, EXTRACT_HERE)


ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
665 class RecursionPolicy(BasePolicy):
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
666 answers = {'o': RECURSE_ONCE, 'a': RECURSE_ALWAYS, 'n': RECURSE_NOT_NOW,
53
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
667 'v': RECURSE_NEVER, 'l': RECURSE_LIST, '': RECURSE_NOT_NOW}
25
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
668 choices = ["You can:",
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
669 " * Always extract included archives",
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
670 " * extract included archives this Once",
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
671 " * choose Not to extract included archives",
53
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
672 " * neVer extract included archives",
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
673 " * List included archives"]
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
674 prompt = "What do you want to do? (a/o/N/v/l) "
25
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
675
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
676 def __init__(self, options):
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
677 BasePolicy.__init__(self, options)
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
678 if options.show_list:
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
679 self.permanent_policy = RECURSE_NEVER
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
680 elif options.recursive:
25
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
681 self.permanent_policy = RECURSE_ALWAYS
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
682
53
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
683 def prep(self, current_filename, target, extractor):
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
684 archive_count = len(extractor.included_archives)
25
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
685 if (self.permanent_policy is not None) or (archive_count == 0):
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
686 self.current_policy = self.permanent_policy or RECURSE_NOT_NOW
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
687 return
69
35a2f45cdd3b Count files in the archive and report that in the recursion prompt.
Brett Smith <brett@brettcsmith.org>
parents: 66
diff changeset
688 question = (("%s contains %s other archive file(s), " +
35a2f45cdd3b Count files in the archive and report that in the recursion prompt.
Brett Smith <brett@brettcsmith.org>
parents: 66
diff changeset
689 "out of %s files total.") %
35a2f45cdd3b Count files in the archive and report that in the recursion prompt.
Brett Smith <brett@brettcsmith.org>
parents: 66
diff changeset
690 (current_filename, archive_count, extractor.file_count))
65
0aea49161478 Make the wording on the One Entry question a little clearer.
Brett Smith <brett@brettcsmith.org>
parents: 62
diff changeset
691 question = textwrap.wrap(question)
53
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
692 if target == '.':
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
693 target = ''
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
694 included_root = extractor.included_root
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
695 if included_root == './':
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
696 included_root = ''
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
697 while True:
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
698 self.current_policy = self.ask_question(question)
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
699 if self.current_policy != RECURSE_LIST:
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
700 break
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
701 print ("\n%s\n" %
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
702 '\n'.join([os.path.join(target, included_root, filename)
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
703 for filename in extractor.included_archives]))
25
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
704 if self.current_policy in (RECURSE_ALWAYS, RECURSE_NEVER):
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
705 self.permanent_policy = self.current_policy
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
706
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
707 def ok_to_recurse(self):
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
708 return self.current_policy in (RECURSE_ALWAYS, RECURSE_ONCE)
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
709
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
710
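The way RecursionPolicy remembers "always" and "never" answers can be sketched in isolation. This is a simplified model, not the script's own class: `SketchRecursionPolicy` and its injected `ask` callable are hypothetical, the RECURSE_LIST loop is left out, and the constant values are assumed to be non-zero so that the `or` fallback behaves as in the listing.

```python
# Hypothetical stand-ins for the script's constants. They are deliberately
# non-zero: "permanent_policy or RECURSE_NOT_NOW" would misfire on a 0 value.
RECURSE_ALWAYS, RECURSE_ONCE, RECURSE_NOT_NOW, RECURSE_NEVER = range(1, 5)


class SketchRecursionPolicy(object):
    """Simplified model of the prompt-or-remember logic (assumption)."""

    def __init__(self, ask):
        self.ask = ask                # callable returning one RECURSE_* value
        self.permanent_policy = None
        self.current_policy = None

    def prep(self, archive_count):
        # No prompt when an answer is already permanent or nothing is nested.
        if self.permanent_policy is not None or archive_count == 0:
            self.current_policy = self.permanent_policy or RECURSE_NOT_NOW
            return
        self.current_policy = self.ask()
        # "Always" and "never" stick for every later archive.
        if self.current_policy in (RECURSE_ALWAYS, RECURSE_NEVER):
            self.permanent_policy = self.current_policy

    def ok_to_recurse(self):
        return self.current_policy in (RECURSE_ALWAYS, RECURSE_ONCE)
```

Answering "once" leaves permanent_policy unset, so the next archive with nested archives prompts again. Answering "always" or "never" suppresses every later prompt.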
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
711 class ExtractorBuilder(object):
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
712 extractor_map = {'tar': (TarExtractor, None),
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
713 'zip': (ZipExtractor, None),
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
714 'deb': (DebExtractor, DebMetadataExtractor),
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
715 'rpm': (RPMExtractor, None),
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
716 'cpio': (CpioExtractor, None),
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
717 'gem': (GemExtractor, GemMetadataExtractor),
32
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
718 'compress': (CompressionExtractor, None),
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
719 '7z': (SevenExtractor, None),
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
720 'cab': (CABExtractor, None)}
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
721
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
722 mimetype_map = {}
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
723 for mapping in (('tar', 'x-tar'),
59
7a0aafe2fe87 [svn] Find self-extracting archives by their file magic only, not extension/mimetype.
brett
parents: 58
diff changeset
724 ('zip', 'zip'),
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
725 ('deb', 'x-debian-package'),
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
726 ('rpm', 'x-redhat-package-manager', 'x-rpm'),
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
727 ('cpio', 'x-cpio'),
32
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
728 ('gem', 'x-ruby-gem'),
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
729 ('7z', 'x-7z-compressed'),
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
730 ('cab', 'x-cab')):
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
731 for mimetype in mapping[1:]:
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
732 if '/' not in mimetype:
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
733 mimetype = 'application/' + mimetype
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
734 mimetype_map[mimetype] = mapping[0]
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
735
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
736 magic_mime_map = {}
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
737 for mapping in (('deb', 'Debian binary package'),
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
738 ('cpio', 'cpio archive'),
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
739 ('tar', 'POSIX tar archive'),
59
7a0aafe2fe87 [svn] Find self-extracting archives by their file magic only, not extension/mimetype.
brett
parents: 58
diff changeset
740 ('zip', '(Zip|ZIP self-extracting) archive'),
32
ec4c845695b3 [svn] Oops, finish adding 7z support.
brett
parents: 31
diff changeset
741 ('rpm', 'RPM'),
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
742 ('7z', '7-zip archive'),
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
743 ('cab', 'Microsoft Cabinet archive')):
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
744 for pattern in mapping[1:]:
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
745 magic_mime_map[re.compile(pattern)] = mapping[0]
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
746
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
747 magic_encoding_map = {}
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
748 for mapping in (('bzip2', 'bzip2 compressed'),
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
749 ('gzip', 'gzip compressed')):
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
750 for pattern in mapping[1:]:
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
751 magic_encoding_map[re.compile(pattern)] = mapping[0]
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
752
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
753 extension_map = {}
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
754 for mapping in (('tar', 'bzip2', 'tar.bz2'),
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
755 ('tar', 'gzip', 'tar.gz', 'tgz'),
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
756 ('tar', None, 'tar'),
59
7a0aafe2fe87 [svn] Find self-extracting archives by their file magic only, not extension/mimetype.
brett
parents: 58
diff changeset
757 ('zip', None, 'zip'),
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
758 ('deb', None, 'deb'),
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
759 ('rpm', None, 'rpm'),
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
760 ('cpio', None, 'cpio'),
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
761 ('gem', None, 'gem'),
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
762 ('compress', 'gzip', 'Z', 'gz'),
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
763 ('compress', 'bzip2', 'bz2'),
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
764 ('compress', 'lzma', 'lzma'),
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
765 ('7z', None, '7z'),
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
766 ('cab', None, 'cab', 'exe')):
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
767 for extension in mapping[2:]:
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
768 extension_map.setdefault(extension, []).append(mapping[:2])
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
769
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
770 def __init__(self, filename, options):
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
771 self.filename = filename
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
772 self.options = options
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
773
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
774 def build_extractor(self, archive_type, encoding):
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
775 extractors = self.extractor_map[archive_type]
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
776 if self.options.metadata and (extractors[1] is not None):
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
777 extractor = extractors[1]
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
778 else:
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
779 extractor = extractors[0]
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
780 return extractor(self.filename, encoding)
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
781
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
782 def get_extractor(self):
48
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
783 tried_types = set()
36
4bf2508d9b9e [svn] Small optimization to be nice to the system: don't try a given extractor
brett
parents: 35
diff changeset
784 # As smart as it is, the magic test can't go first, because at least
4bf2508d9b9e [svn] Small optimization to be nice to the system: don't try a given extractor
brett
parents: 35
diff changeset
785 # on my system it just recognizes gem files as tar files. I guess
4bf2508d9b9e [svn] Small optimization to be nice to the system: don't try a given extractor
brett
parents: 35
diff changeset
786 # it's possible for the opposite problem to occur -- where the mimetype
4bf2508d9b9e [svn] Small optimization to be nice to the system: don't try a given extractor
brett
parents: 35
diff changeset
787 # or extension suggests something less than ideal -- but it seems less
4bf2508d9b9e [svn] Small optimization to be nice to the system: don't try a given extractor
brett
parents: 35
diff changeset
788 # likely so I'm sticking with this.
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
789 for func_name in ('mimetype', 'extension', 'magic'):
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
790 logger.debug("getting extractors by %s" % (func_name,))
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
791 extractor_types = \
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
792 getattr(self, 'try_by_' + func_name)(self.filename)
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
793 logger.debug("done getting extractors")
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
794 for ext_args in extractor_types:
36
4bf2508d9b9e [svn] Small optimization to be nice to the system: don't try a given extractor
brett
parents: 35
diff changeset
795 if ext_args in tried_types:
4bf2508d9b9e [svn] Small optimization to be nice to the system: don't try a given extractor
brett
parents: 35
diff changeset
796 continue
4bf2508d9b9e [svn] Small optimization to be nice to the system: don't try a given extractor
brett
parents: 35
diff changeset
797 tried_types.add(ext_args)
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
798 logger.debug("trying %s extractor from %s" %
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
799 (ext_args, func_name))
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
800 yield self.build_extractor(*ext_args)
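The "try each detector in order, but never the same candidate twice" pattern in get_extractor can be illustrated on its own. This is a minimal sketch with assumed names: `unique_candidates` and the three lambda detectors are hypothetical and stand in for the try_by_* methods; no extractor objects are built.

```python
def unique_candidates(detectors, filename):
    """Yield each (archive_type, encoding) pair once, in detector order."""
    tried = set()
    for detect in detectors:
        for candidate in detect(filename):
            if candidate in tried:
                continue              # an earlier detector already offered it
            tried.add(candidate)
            yield candidate


# Hypothetical detectors that return fixed guesses for illustration.
by_mimetype = lambda name: [('tar', 'gzip')]
by_extension = lambda name: [('tar', 'gzip'), ('compress', 'gzip')]
by_magic = lambda name: [('tar', 'gzip')]

candidates = list(unique_candidates(
    [by_mimetype, by_extension, by_magic], 'example.tar.gz'))
```

Because the function is a generator, later detectors only run if the caller keeps asking for more candidates, which is presumably why the listing yields instead of building a full list.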
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
801
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
802 def try_by_mimetype(cls, filename):
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
803 mimetype, encoding = mimetypes.guess_type(filename)
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
804 try:
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
805 return [(cls.mimetype_map[mimetype], encoding)]
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
806 except KeyError:
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
807 if encoding:
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
808 return [('compress', encoding)]
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
809 return []
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
810 try_by_mimetype = classmethod(try_by_mimetype)
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
811
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
812 def magic_map_matches(cls, output, magic_map):
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
813 return [result for regexp, result in magic_map.items()
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
814 if regexp.search(output)]
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
815 magic_map_matches = classmethod(magic_map_matches)
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
816
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
817 def try_by_magic(cls, filename):
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
818 process = subprocess.Popen(['file', '-z', filename],
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
819 stdout=subprocess.PIPE)
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
820 status = process.wait()
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
821 if status != 0:
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
822 return []
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
823 output = process.stdout.readline()
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
824 process.stdout.close()
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
825 if output.startswith('%s: ' % filename):
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
826 output = output[len(filename) + 2:]
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
827 mimes = cls.magic_map_matches(output, cls.magic_mime_map)
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
828 encodings = cls.magic_map_matches(output, cls.magic_encoding_map)
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
829 if mimes and not encodings:
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
830 encodings = [None]
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
831 elif encodings and not mimes:
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
832 mimes = ['compress']
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
833 return [(m, e) for m in mimes for e in encodings]
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
834 try_by_magic = classmethod(try_by_magic)
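The pairing of archive types and encodings in try_by_magic can be exercised without running `file`. The sketch below is an assumption-laden illustration: `classify` is a hypothetical helper, the maps contain only a subset of the patterns from the listing, and the description strings passed to it are made-up examples rather than captured output.

```python
import re

# Subset of the patterns from the listing, keyed by compiled regexp.
magic_mime_map = {re.compile('POSIX tar archive'): 'tar',
                  re.compile('(Zip|ZIP self-extracting) archive'): 'zip'}
magic_encoding_map = {re.compile('bzip2 compressed'): 'bzip2',
                      re.compile('gzip compressed'): 'gzip'}


def magic_map_matches(output, magic_map):
    return [result for regexp, result in magic_map.items()
            if regexp.search(output)]


def classify(output):
    """Return (archive_type, encoding) guesses for one description string."""
    mimes = magic_map_matches(output, magic_mime_map)
    encodings = magic_map_matches(output, magic_encoding_map)
    if mimes and not encodings:
        encodings = [None]            # plain archive, no compression layer
    elif encodings and not mimes:
        mimes = ['compress']          # compressed file that is not an archive
    return [(m, e) for m in mimes for e in encodings]
```

When neither map matches, both lists stay empty and the comprehension returns an empty list, so the caller simply moves on to the next detection method.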
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
835
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
836 def try_by_extension(cls, filename):
43
4591a32eedc8 [svn] Sadly Python 2.3 does not have an rsplit method on strings.
brett
parents: 42
diff changeset
837 parts = filename.split('.')[-2:]
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
838 results = []
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
839 while parts:
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
840 results.extend(cls.extension_map.get('.'.join(parts), []))
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
841 del parts[0]
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
842 return results
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
843 try_by_extension = classmethod(try_by_extension)
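The extension lookup checks the last two dot-separated parts first and then the last part alone, so a name like foo.tar.gz yields both a tar guess and a plain-compression guess. A standalone sketch, using a reduced extension table and a module-level function in place of the classmethod (both simplifications are assumptions for illustration):

```python
# Reduced version of the table in the listing: (type, encoding, extensions...).
extension_map = {}
for mapping in (('tar', 'gzip', 'tar.gz', 'tgz'),
                ('tar', None, 'tar'),
                ('compress', 'gzip', 'Z', 'gz')):
    for extension in mapping[2:]:
        extension_map.setdefault(extension, []).append(mapping[:2])


def try_by_extension(filename):
    # split()[-2:] is used in the listing because Python 2.3 lacks rsplit.
    parts = filename.split('.')[-2:]
    results = []
    while parts:
        results.extend(extension_map.get('.'.join(parts), []))
        del parts[0]                  # retry with the shorter suffix
    return results
```

The longer suffix is tried first, so the more specific guess (tar with gzip) comes before the generic one (gzip-compressed file).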
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
844
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
845
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
846 class BaseAction(object):
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
847 def __init__(self, options, filenames):
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
848 self.options = options
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
849 self.filenames = filenames
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
850 self.target = None
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
851
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
852 def report(self, function, *args):
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
853 try:
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
854 error = function(*args)
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
855 except EXTRACTION_ERRORS, exception:
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
856 error = str(exception)
33
3547e3124729 [svn] Fix some bugs and make things a little more user-friendly now that we can
brett
parents: 32
diff changeset
857 logger.debug(''.join(traceback.format_exception(*sys.exc_info())))
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
858 return error
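The report method turns expected failures into error strings and logs the traceback only at debug level. A minimal sketch of that pattern follows. The contents of `EXTRACTION_ERRORS` here are a placeholder assumption, since the real tuple is defined elsewhere in the script, and the `except ... as` spelling is used instead of the comma form in the listing so that the sketch also runs on current Python.

```python
import logging
import sys
import traceback

logger = logging.getLogger('dtrx-sketch')   # hypothetical logger name

# Placeholder assumption: the script defines its own tuple of exceptions.
EXTRACTION_ERRORS = (OSError, ValueError)


def report(function, *args):
    """Call function(*args); return its result or the error message."""
    try:
        error = function(*args)
    except EXTRACTION_ERRORS as exception:
        error = str(exception)
        # The full traceback is only visible when debug logging is enabled.
        logger.debug(''.join(traceback.format_exception(*sys.exc_info())))
    return error
```

A return value of None means success. Anything else is a message to show the user, whether the function returned it or raised one of the listed exceptions.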
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
859
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
860
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
861 class ExtractionAction(BaseAction):
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
862 handlers = [FlatHandler, OverwriteHandler, MatchHandler, EmptyHandler,
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
863 BombHandler]
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
864
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
865 def __init__(self, options, filenames):
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
866 BaseAction.__init__(self, options, filenames)
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
867 self.did_print = False
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
868
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
869 def get_handler(self, extractor):
65
0aea49161478 Make the wording on the One Entry question a little clearer.
Brett Smith <brett@brettcsmith.org>
parents: 62
diff changeset
870 if extractor.content_type in ONE_ENTRY_UNKNOWN:
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
871 self.options.one_entry_policy.prep(self.current_filename,
65
0aea49161478 Make the wording on the One Entry question a little clearer.
Brett Smith <brett@brettcsmith.org>
parents: 62
diff changeset
872 extractor)
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
873 for handler in self.handlers:
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
874 if handler.can_handle(extractor.content_type, self.options):
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
875 logger.debug("using %s handler" % (handler.__name__,))
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
876 self.current_handler = handler(extractor, self.options)
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
877 break
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
878
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
879 def show_extraction(self, extractor):
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
880 if self.options.log_level > logging.INFO:
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
881 return
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
882 elif self.did_print:
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
883 print
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
884 else:
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
885 self.did_print = True
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
886 print "%s:" % (self.current_filename,)
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
887 if extractor.contents is None:
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
888 print self.current_handler.target
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
889 return
55
494516c027c4 [svn] Stupid Python 2.3 doesn't support [].sort(reverse=True).
brett
parents: 54
diff changeset
890 def reverser(x, y):
494516c027c4 [svn] Stupid Python 2.3 doesn't support [].sort(reverse=True).
brett
parents: 54
diff changeset
891 return cmp(y, x)
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
892 if self.current_handler.target == '.':
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
893 filenames = extractor.contents
55
494516c027c4 [svn] Stupid Python 2.3 doesn't support [].sort(reverse=True).
brett
parents: 54
diff changeset
894 filenames.sort(reverser)
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
895 else:
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
896 filenames = [self.current_handler.target]
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
897 pathjoin = os.path.join
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
898 isdir = os.path.isdir
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
899 while filenames:
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
900 filename = filenames.pop()
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
901 if isdir(filename):
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
902 print "%s/" % (filename,)
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
903 new_filenames = os.listdir(filename)
55
494516c027c4 [svn] Stupid Python 2.3 doesn't support [].sort(reverse=True).
brett
parents: 54
diff changeset
904 new_filenames.sort(reverser)
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
905 filenames.extend([pathjoin(filename, new_filename)
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
906 for new_filename in new_filenames])
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
907 else:
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
908 print filename
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
909
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
910 def run(self, filename, extractor):
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
911 self.current_filename = filename
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
912 error = (self.report(extractor.extract) or
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
913 self.report(self.get_handler, extractor) or
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
914 self.report(self.current_handler.handle) or
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
915 self.report(self.show_extraction, extractor))
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
916 if not error:
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
917 self.target = self.current_handler.target
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
918 return error
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
919
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
920
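
The listing order produced by show_extraction above comes from a small stack trick: names are sorted in reverse so that pop() hands them back in ascending order, and each directory pushes its children (also reverse-sorted) onto the same stack. The sketch below is a standalone illustration of that idea in modern Python, not the script's own code; the function name walk_sorted and its argument are hypothetical, and it yields paths instead of printing them.

```python
import os


def walk_sorted(start_names):
    """Yield paths depth-first in ascending name order.

    Directories are yielded with a trailing slash, followed by their contents.
    (Hypothetical helper; mirrors the pop()/reverse-sort idea in the listing.)
    """
    # Reverse-sort so that list.pop(), which takes from the end,
    # returns the alphabetically smallest pending name first.
    pending = sorted(start_names, reverse=True)
    while pending:
        path = pending.pop()
        if os.path.isdir(path):
            yield path + "/"
            children = sorted(os.listdir(path), reverse=True)
            # Children go on top of the stack, so they are listed
            # before the remaining siblings of this directory.
            pending.extend(os.path.join(path, child) for child in children)
        else:
            yield path
```

Using sorted(..., reverse=True) here assumes a Python newer than 2.3; the listing defines its own reverser comparison function precisely because that keyword was not available there.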
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
921 class ListAction(BaseAction):
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
922 def __init__(self, options, filenames):
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
923 BaseAction.__init__(self, options, filenames)
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
924 self.count = 0
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
925
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
926 def get_list(self, extractor):
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
927 # Note: The reason I'm getting all the filenames up front is
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
928 # because if we run into trouble partway through the archive, we'll
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
929 # try another extractor. So before we display anything we have to
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
930 # be sure this one is successful. We maybe don't have to be quite
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
931 # this conservative but this is the easy way out for now.
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
932 self.filelist = list(extractor.get_filenames())
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
933
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
934 def show_list(self, filename):
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
935 self.count += 1
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
936 if len(self.filenames) != 1:
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
937 if self.count > 1:
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
938 print
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
939 print "%s:" % (filename,)
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
940 print '\n'.join(self.filelist)
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
941
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
942 def run(self, filename, extractor):
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
943 return (self.report(self.get_list, extractor) or
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
944 self.report(self.show_list, filename))
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
945
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
946
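
Both action classes above chain their steps with "or": report() returns None on success or an error string on failure, so the first failing step short-circuits the rest and its message becomes the result of run(). A minimal sketch of that pattern in modern Python follows. The exception tuple stands in for EXTRACTION_ERRORS, whose real contents are not shown in this part of the listing, and the step functions passed to run are hypothetical.

```python
def report(function, *args):
    """Run one step and return None on success or an error string on failure."""
    try:
        return function(*args)
    except (OSError, ValueError) as exception:  # assumed stand-in for EXTRACTION_ERRORS
        return str(exception)


def run(extract, get_handler, handle):
    """Run the steps in order and return the first error, or None if all succeed."""
    # "or" evaluates left to right and stops at the first truthy value,
    # so later steps are skipped once an earlier one reports an error.
    return (report(extract) or
            report(get_handler) or
            report(handle))
```

One consequence of this design is that a step must return a falsy value on success; any non-empty return value is treated as an error message.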
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
947 class ExtractorApplication(object):
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
948 def __init__(self, arguments):
48
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
949 for signal_num in (signal.SIGINT, signal.SIGTERM):
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
950 signal.signal(signal_num, self.abort)
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
951 self.parse_options(arguments)
12
5d202467c589 [svn] Introduce a real logging system. Right now all this really gets us is the
brett
parents: 11
diff changeset
952 self.setup_logger()
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
953 self.successes = []
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
954 self.failures = []
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
955
48
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
956 def abort(self, signal_num, frame):
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
957 signal.signal(signal_num, signal.SIG_IGN)
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
958 print
49
c76dd2716113 [svn] Add a traceback when we catch a signal.
brett
parents: 48
diff changeset
959 logger.debug("traceback:\n" +
c76dd2716113 [svn] Add a traceback when we catch a signal.
brett
parents: 48
diff changeset
960 ''.join(traceback.format_stack(frame)).rstrip())
48
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
961 logger.debug("got signal %s; cleaning up" % (signal_num,))
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
962 clean_targets = set([os.path.realpath('.')])
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
963 if hasattr(self, 'current_directory'):
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
964 clean_targets.add(os.path.realpath(self.current_directory))
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
965 for directory in clean_targets:
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
966 os.chdir(directory)
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
967 for path in glob.glob('.dtrx-*'):
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
968 try:
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
969 os.unlink(path)
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
970 except OSError, error:
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
971 if error.errno == errno.EISDIR:
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
972 shutil.rmtree(path, ignore_errors=True)
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
973 sys.exit(1)
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
974
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
975 def parse_options(self, arguments):
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
976 parser = optparse.OptionParser(
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
977 usage="%prog [options] archive [archive2 ...]",
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
978 description="Intelligent archive extractor",
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
979 version=VERSION_BANNER
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
980 )
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
981 parser.add_option('-r', '--recursive', dest='recursive',
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
982 action='store_true', default=False,
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
983 help='extract archives contained in the ones listed')
13
0a3ef1b9f6d4 [svn] Add options to tweak the logging level to taste.
brett
parents: 12
diff changeset
984 parser.add_option('-q', '--quiet', dest='quiet',
0a3ef1b9f6d4 [svn] Add options to tweak the logging level to taste.
brett
parents: 12
diff changeset
985 action='count', default=3,
0a3ef1b9f6d4 [svn] Add options to tweak the logging level to taste.
brett
parents: 12
diff changeset
986 help='suppress warning/error messages')
0a3ef1b9f6d4 [svn] Add options to tweak the logging level to taste.
brett
parents: 12
diff changeset
987 parser.add_option('-v', '--verbose', dest='verbose',
0a3ef1b9f6d4 [svn] Add options to tweak the logging level to taste.
brett
parents: 12
diff changeset
988 action='count', default=0,
0a3ef1b9f6d4 [svn] Add options to tweak the logging level to taste.
brett
parents: 12
diff changeset
989 help='be verbose/print debugging information')
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
990 parser.add_option('-o', '--overwrite', dest='overwrite',
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
991 action='store_true', default=False,
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
992 help='overwrite any existing target directory')
15
28dbd52a8bb8 [svn] Add a -f/--flat option, which will extract the archive contents into the
brett
parents: 14
diff changeset
993 parser.add_option('-f', '--flat', '--no-directory', dest='flat',
28dbd52a8bb8 [svn] Add a -f/--flat option, which will extract the archive contents into the
brett
parents: 14
diff changeset
994 action='store_true', default=False,
28dbd52a8bb8 [svn] Add a -f/--flat option, which will extract the archive contents into the
brett
parents: 14
diff changeset
995 help="don't put contents in their own directory")
19
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
996 parser.add_option('-l', '-t', '--list', '--table', dest='show_list',
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
997 action='store_true', default=False,
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
998 help="list contents of archives on standard output")
20
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
999 parser.add_option('-n', '--noninteractive', dest='batch',
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
1000 action='store_true', default=False,
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
1001 help="don't ask how to handle special cases")
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
1002 parser.add_option('-m', '--metadata', dest='metadata',
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
1003 action='store_true', default=False,
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
1004 help="extract metadata from a .deb/.gem")
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
1005 self.options, filenames = parser.parse_args(arguments)
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
1006 if not filenames:
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
1007 parser.error("you did not list any archives")
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
1008 # This makes WARNING the default.
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
1009 self.options.log_level = (10 * (self.options.quiet -
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
1010 self.options.verbose))
25
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
1011 self.options.one_entry_policy = OneEntryPolicy(self.options)
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
1012 self.options.recursion_policy = RecursionPolicy(self.options)
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
1013 self.archives = {os.path.realpath(os.curdir): filenames}
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
1014
12
5d202467c589 [svn] Introduce a real logging system. Right now all this really gets us is the
brett
parents: 11
diff changeset
1015 def setup_logger(self):
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
1016 logging.getLogger().setLevel(self.options.log_level)
12
5d202467c589 [svn] Introduce a real logging system. Right now all this really gets us is the
brett
parents: 11
diff changeset
1017 handler = logging.StreamHandler()
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
1018 handler.setLevel(self.options.log_level)
19
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
1019 formatter = logging.Formatter("dtrx: %(levelname)s: %(message)s")
12
5d202467c589 [svn] Introduce a real logging system. Right now all this really gets us is the
brett
parents: 11
diff changeset
1020 handler.setFormatter(formatter)
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
1021 logger.addHandler(handler)
33
3547e3124729 [svn] Fix some bugs and make things a little more user-friendly now that we can
brett
parents: 32
diff changeset
1022 logger.debug("logger is set up")
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
1023
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
1024 def recurse(self, filename, extractor, action):
53
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
1025 self.options.recursion_policy.prep(filename, action.target, extractor)
25
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
1026 if self.options.recursion_policy.ok_to_recurse():
53
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
1027 for filename in extractor.included_archives:
25
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
1028 tail_path, basename = os.path.split(filename)
47
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
1029 directory = os.path.join(self.current_directory, action.target,
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
1030 extractor.included_root, tail_path)
25
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
1031 self.archives.setdefault(directory, []).append(basename)
8
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
1032
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1033 def check_file(self, filename):
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1034 try:
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1035 result = os.stat(filename)
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1036 except OSError, error:
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1037 return error.strerror
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1038 if stat.S_ISDIR(result.st_mode):
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1039 return "cannot extract a directory"
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1040
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1041 def try_extractors(self, filename, builder):
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1042 errors = []
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1043 for extractor in builder:
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1044 error = self.action.run(filename, extractor)
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1045 if error:
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1046 errors.append((extractor.file_type, extractor.encoding, error))
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1047 else:
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1048 self.recurse(filename, extractor, self.action)
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1049 return
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1050 logger.error("could not handle %s" % (filename,))
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1051 if not errors:
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1052 logger.error("not a known archive type")
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1053 return True
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1054 for file_type, encoding, error in errors:
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1055 message = ["treating as", file_type, "failed:", error]
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1056 if encoding:
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1057 message.insert(1, "%s-encoded" % (encoding,))
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1058 logger.error(' '.join(message))
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1059 return True
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1060
19
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
1061 def run(self):
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
1062 if self.options.show_list:
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
1063 action = ListAction
19
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
1064 else:
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
1065 action = ExtractionAction
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1066 self.action = action(self.options, self.archives.values()[0])
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
1067 while self.archives:
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
1068 self.current_directory, self.filenames = self.archives.popitem()
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
1069 os.chdir(self.current_directory)
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
1070 for filename in self.filenames:
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
1071 builder = ExtractorBuilder(filename, self.options)
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1072 error = (self.check_file(filename) or
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1073 self.try_extractors(filename, builder.get_extractor()))
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1074 if error:
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1075 if error != True:
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1076 logger.error("%s: %s" % (filename, error))
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1077 self.failures.append(filename)
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
1078 else:
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1079 self.successes.append(filename)
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
1080 self.options.one_entry_policy.permanent_policy = EXTRACT_WRAP
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
1081 if self.failures:
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
1082 return 1
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
1083 return 0
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
1084
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
1085
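
The log level arithmetic in parse_options above can be checked in isolation. Because -q uses action='count' with default=3, the level starts at 10 * (3 - 0) = 30, which is logging.WARNING; each -v lowers it by 10 and each -q raises it by 10. The sketch below rebuilds only those two options with optparse. The function name log_level_for is hypothetical, and the rest of the script's options are omitted.

```python
import logging
import optparse


def log_level_for(arguments):
    """Return the numeric logging level implied by the -q/-v flags."""
    parser = optparse.OptionParser()
    # Same option definitions as in the listing, minus the help text.
    parser.add_option('-q', '--quiet', dest='quiet',
                      action='count', default=3)
    parser.add_option('-v', '--verbose', dest='verbose',
                      action='count', default=0)
    options, _ = parser.parse_args(arguments)
    return 10 * (options.quiet - options.verbose)


# No flags gives WARNING, one -v gives INFO, two give DEBUG, one -q gives ERROR.
for flags in ([], ['-v'], ['-vv'], ['-q']):
    level = log_level_for(flags)
    print("%-6s -> %d (%s)" % (' '.join(flags), level, logging.getLevelName(level)))
```

This also explains the check in show_extraction: log_level > logging.INFO is true at the default of 30, so the file list is printed only once at least one -v has been given.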
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
1086 if __name__ == '__main__':
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
1087 app = ExtractorApplication(sys.argv[1:])
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
1088 sys.exit(app.run())
