scripts/dtrx

author: Matthew Wild <mwild1@gmail.com>
date: Wed, 13 Jan 2010 19:39:36 +0000
branch: trunk
changeset: 125:c4495fc7d00d
parent: 123:8570c14304bb
permissions: -rwxr-xr-x

Add support for fetching archives from URLs

1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
1 #!/usr/bin/env python
92
d9319958bb5a Add RAR support. Thanks to Peter Kelemen for the patch.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 89
2 # -*- coding: utf-8 -*-
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
3 #
19
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
4 # dtrx -- Intelligently extract various archive types.
121
957d19158b7e release dtrx 6.5
Brett Smith <brettcsmith@brettcsmith.org>
parents: 117
5 # Copyright © 2006-2009 Brett Smith <brettcsmith@brettcsmith.org>
114
d2a28fe2a8ff use a real UTF-8 copyright symbol everywhere
Brett Smith <brettcsmith@brettcsmith.org>
parents: 111
6 # Copyright © 2008 Peter Kelemen <Peter.Kelemen@gmail.com>
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
7 #
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
8 # This program is free software; you can redistribute it and/or modify it
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
9 # under the terms of the GNU General Public License as published by the
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
10 # Free Software Foundation; either version 3 of the License, or (at your
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
11 # option) any later version.
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
12 #
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
13 # This program is distributed in the hope that it will be useful, but
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
14 # WITHOUT ANY WARRANTY; without even the implied warranty of
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
15 # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
16 # Public License for more details.
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
17 #
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
18 # You should have received a copy of the GNU General Public License along
42
4a4cab75d5e6 [svn] Update documentation.
brett
parents: 41
19 # with this program; if not, see <http://www.gnu.org/licenses/>.
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
20
72
c4cfaf634bb9 Add support for InstallShield archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 71
21 # Python 2.3 string methods: 'rfind', 'rindex', 'rjust', 'rstrip'
c4cfaf634bb9 Add support for InstallShield archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 71
22
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
23 import errno
103
f68a0ca870b0 Reword one entry prompt; wrap prompt choices; better term size detection.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 101
24 import fcntl
12
5d202467c589 [svn] Introduce a real logging system. Right now all this really gets us is the
brett
parents: 11
25 import logging
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
26 import mimetypes
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
27 import optparse
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
28 import os
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
29 import re
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
30 import shutil
48
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
31 import signal
15
28dbd52a8bb8 [svn] Add a -f/--flat option, which will extract the archive contents into the
brett
parents: 14
32 import stat
110
6cbe6cb5a903 Use string.rindex (which is in Python 2.3) instead of my own version.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 109
33 import string
103
f68a0ca870b0 Reword one entry prompt; wrap prompt choices; better term size detection.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 101
34 import struct
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
35 import subprocess
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
36 import sys
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
37 import tempfile
103
f68a0ca870b0 Reword one entry prompt; wrap prompt choices; better term size detection.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 101
38 import termios
20
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
39 import textwrap
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
40 import traceback
125
c4495fc7d00d Add support for fetching archives from URLs
Matthew Wild <mwild1@gmail.com>
parents: 123
41 import urllib
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
42
94
6cdbdffa2e2e Avoid DeprecationWarning under Python 2.6. Part of the 6.3 release.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 93
43 try:
6cdbdffa2e2e Avoid DeprecationWarning under Python 2.6. Part of the 6.3 release.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 93
44 set
6cdbdffa2e2e Avoid DeprecationWarning under Python 2.6. Part of the 6.3 release.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 93
45 except NameError:
6cdbdffa2e2e Avoid DeprecationWarning under Python 2.6. Part of the 6.3 release.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 93
46 from sets import Set as set
36
4bf2508d9b9e [svn] Small optimization to be nice to the system: don't try a given extractor
brett
parents: 35
47
123
8570c14304bb add support for xz compression
Brett Smith <brettcsmith@brettcsmith.org>
parents: 121
48 VERSION = "6.6"
19
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
49 VERSION_BANNER = """dtrx version %s
121
957d19158b7e release dtrx 6.5
Brett Smith <brettcsmith@brettcsmith.org>
parents: 117
50 Copyright © 2006-2009 Brett Smith <brettcsmith@brettcsmith.org>
117
c43771363c6f convert more copyright symbols
Brett Smith <brettcsmith@brettcsmith.org>
parents: 115
51 Copyright © 2008 Peter Kelemen <Peter.Kelemen@gmail.com>
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
52
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
53 This program is free software; you can redistribute it and/or modify it
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
54 under the terms of the GNU General Public License as published by the
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
55 Free Software Foundation; either version 3 of the License, or (at your
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
56 option) any later version.
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
57
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
58 This program is distributed in the hope that it will be useful, but
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
59 WITHOUT ANY WARRANTY; without even the implied warranty of
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
60 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
61 Public License for more details.""" % (VERSION,)
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
62
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
63 MATCHING_DIRECTORY = 1
65
0aea49161478 Make the wording on the One Entry question a little clearer.
Brett Smith <brett@brettcsmith.org>
parents: 62
64 ONE_ENTRY_KNOWN = 2
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
65 BOMB = 3
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
66 EMPTY = 4
65
0aea49161478 Make the wording on the One Entry question a little clearer.
Brett Smith <brett@brettcsmith.org>
parents: 62
67 ONE_ENTRY_FILE = 'file'
0aea49161478 Make the wording on the One Entry question a little clearer.
Brett Smith <brett@brettcsmith.org>
parents: 62
68 ONE_ENTRY_DIRECTORY = 'directory'
0aea49161478 Make the wording on the One Entry question a little clearer.
Brett Smith <brett@brettcsmith.org>
parents: 62
69
0aea49161478 Make the wording on the One Entry question a little clearer.
Brett Smith <brett@brettcsmith.org>
parents: 62
70 ONE_ENTRY_UNKNOWN = [ONE_ENTRY_FILE, ONE_ENTRY_DIRECTORY]
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
71
20
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
72 EXTRACT_HERE = 1
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
73 EXTRACT_WRAP = 2
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
74 EXTRACT_RENAME = 3
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
75
23
039dd321a7d0 [svn] If an archive contains other archives, and the user didn't specify that
brett
parents: 22
76 RECURSE_ALWAYS = 1
039dd321a7d0 [svn] If an archive contains other archives, and the user didn't specify that
brett
parents: 22
77 RECURSE_ONCE = 2
039dd321a7d0 [svn] If an archive contains other archives, and the user didn't specify that
brett
parents: 22
78 RECURSE_NOT_NOW = 3
039dd321a7d0 [svn] If an archive contains other archives, and the user didn't specify that
brett
parents: 22
79 RECURSE_NEVER = 4
53
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
80 RECURSE_LIST = 5
23
039dd321a7d0 [svn] If an archive contains other archives, and the user didn't specify that
brett
parents: 22
81
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
82 mimetypes.encodings_map.setdefault('.bz2', 'bzip2')
34
a8f875e02c83 [svn] Add support for LZMA compression. Holy crap that was easy.
brett
parents: 33
83 mimetypes.encodings_map.setdefault('.lzma', 'lzma')
123
8570c14304bb add support for xz compression
Brett Smith <brettcsmith@brettcsmith.org>
parents: 121
84 mimetypes.encodings_map.setdefault('.xz', 'xz')
41
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
85 mimetypes.types_map.setdefault('.gem', 'application/x-ruby-gem')
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
86
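As a rough illustration of what the `setdefault` registrations on lines 82-84 achieve: once a compression suffix is in `mimetypes.encodings_map`, `mimetypes.guess_type` reports the compression as the encoding and looks up the type from the rest of the name. The sketch below uses current Python 3 syntax rather than the Python 2 of the listing; `describe` and the file names are made up for the example.

```python
import mimetypes

# Register the suffixes only if the running Python does not already know them.
mimetypes.encodings_map.setdefault('.bz2', 'bzip2')
mimetypes.encodings_map.setdefault('.lzma', 'lzma')
mimetypes.encodings_map.setdefault('.xz', 'xz')


def describe(filename):
    """Return (content type, encoding) guessed from the file name alone."""
    return mimetypes.guess_type(filename)


if __name__ == '__main__':
    # Hypothetical file names, used only for illustration.
    for name in ('backup.tar.bz2', 'backup.tar.lzma', 'backup.tar.xz'):
        print(name, describe(name))
```

Using `setdefault` rather than plain assignment leaves any mapping the platform already provides untouched.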
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
87 logger = logging.getLogger('dtrx-log')
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
88
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
89 class FilenameChecker(object):
58
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
90 free_func = os.open
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
91 free_args = (os.O_CREAT | os.O_EXCL,)
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
92 free_close = os.close
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
93
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
94 def __init__(self, original_name):
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
95 self.original_name = original_name
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
96
17
481a2b4be471 [svn] Lots of tests for various boundary cases, and slightly better handling for
brett
parents: 16
97 def is_free(self, filename):
58
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
98 try:
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
99 result = self.free_func(filename, *self.free_args)
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
100 except OSError, error:
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
101 if error.errno == errno.EEXIST:
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
102 return False
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
103 raise
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
104 if self.free_close:
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
105 self.free_close(result)
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
106 return True
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
107
41
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
108 def create(self):
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
109 fd, filename = tempfile.mkstemp(prefix=self.original_name + '.',
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
110 dir='.')
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
111 os.close(fd)
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
112 return filename
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
113
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
114 def check(self):
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
115 for suffix in [''] + ['.%s' % (x,) for x in range(1, 10)]:
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
116 filename = '%s%s' % (self.original_name, suffix)
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
117 if self.is_free(filename):
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
118 return filename
41
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
119 return self.create()
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
120
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
121
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
122 class DirectoryChecker(FilenameChecker):
58
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
123 free_func = os.mkdir
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
124 free_args = ()
16506464d57b [svn] Steel FilenameChecker against race conditions.
brett
parents: 55
125 free_close = None
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
126
41
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
127 def create(self):
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
128 return tempfile.mkdtemp(prefix=self.original_name + '.', dir='.')
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
129
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
130
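The two checker classes above avoid a check-then-create race by letting the creation call itself be the test: `os.open` with `O_CREAT | O_EXCL` (or `os.mkdir` for directories) fails with `EEXIST` when the name is already taken. A minimal standalone sketch of the same idea follows, in Python 3 syntax; `reserve_file` and `pick_name` are illustrative names and not part of dtrx.

```python
import errno
import os
import tempfile


def reserve_file(name):
    """Atomically create `name`; return False if it already exists."""
    try:
        fd = os.open(name, os.O_CREAT | os.O_EXCL)
    except OSError as error:
        if error.errno == errno.EEXIST:
            return False
        raise  # any other failure (permissions, missing parent) is real
    os.close(fd)
    return True


def pick_name(original_name):
    """Try name, name.1 ... name.9, then fall back to a random suffix."""
    for suffix in [''] + ['.%d' % n for n in range(1, 10)]:
        candidate = original_name + suffix
        if reserve_file(candidate):
            return candidate
    # All ten fixed names are taken: let tempfile pick a unique one.
    fd, candidate = tempfile.mkstemp(prefix=original_name + '.', dir='.')
    os.close(fd)
    return candidate
```

Because the file exists as soon as the name is chosen, a second process running the same code at the same moment cannot be handed the same name.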
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
131 class ExtractorError(Exception):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
132 pass
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
133
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
134
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
135 class ExtractorUnusable(Exception):
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
136 pass
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
137
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
138
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
139 EXTRACTION_ERRORS = (ExtractorError, ExtractorUnusable, OSError, IOError)
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
140
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
141 class BaseExtractor(object):
34
a8f875e02c83 [svn] Add support for LZMA compression. Holy crap that was easy.
brett
parents: 33
142 decoders = {'bzip2': 'bzcat', 'gzip': 'zcat', 'compress': 'zcat',
123
8570c14304bb add support for xz compression
Brett Smith <brettcsmith@brettcsmith.org>
parents: 121
143 'lzma': 'lzcat', 'xz': 'xzcat'}
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
144 name_checker = DirectoryChecker
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
145
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
146 def __init__(self, filename, encoding):
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
147 if encoding and (not self.decoders.has_key(encoding)):
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
148 raise ValueError("unrecognized encoding %s" % (encoding,))
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
149 self.filename = os.path.realpath(filename)
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
150 self.encoding = encoding
69
35a2f45cdd3b Count files in the archive and report that in the recursion prompt.
Brett Smith <brett@brettcsmith.org>
parents: 66
151 self.file_count = 0
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
152 self.included_archives = []
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
153 self.target = None
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
154 self.content_type = None
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
155 self.content_name = None
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
156 self.pipes = []
62
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
157 self.stderr = tempfile.TemporaryFile()
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
158 self.exit_codes = []
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
159 try:
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
160 self.archive = open(filename, 'r')
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
161 except (IOError, OSError), error:
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
162 raise ExtractorError("could not open %s: %s" %
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
163 (filename, error.strerror))
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
164 if encoding:
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
165 self.pipe([self.decoders[encoding]], "decoding")
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
166 self.prepare()
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
167
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
168 def pipe(self, command, description="extraction"):
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
169 self.pipes.append((command, description))
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
170
62
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
171 def first_bad_exit_code(self):
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
172 for index, code in enumerate(self.exit_codes):
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
173 if code != 0:
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
174 return index
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
175 return None
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
176
106
dcf005ef7070 Start printing results ASAP with -l or -t.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 105
177 def add_process(self, processes, command, stdin, stdout):
dcf005ef7070 Start printing results ASAP with -l or -t.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 105
178 try:
dcf005ef7070 Start printing results ASAP with -l or -t.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 105
179 processes.append(subprocess.Popen(command, stdin=stdin,
dcf005ef7070 Start printing results ASAP with -l or -t.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 105
180 stdout=stdout,
dcf005ef7070 Start printing results ASAP with -l or -t.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 105
181 stderr=self.stderr))
dcf005ef7070 Start printing results ASAP with -l or -t.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 105
182 except OSError, error:
dcf005ef7070 Start printing results ASAP with -l or -t.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 105
183 if error.errno == errno.ENOENT:
dcf005ef7070 Start printing results ASAP with -l or -t.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 105
184 raise ExtractorUnusable("could not run %s" % (command[0],))
dcf005ef7070 Start printing results ASAP with -l or -t.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 105
185 raise
dcf005ef7070 Start printing results ASAP with -l or -t.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 105
186
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
187 def run_pipes(self, final_stdout=None):
41
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
188 if not self.pipes:
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
189 return
e3675644bbb6 [svn] Minor clean-ups. The most important of these is that we now have a better
brett
parents: 40
190 elif final_stdout is None:
106
dcf005ef7070 Start printing results ASAP with -l or -t.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 105
191 final_stdout = open('/dev/null', 'w')
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
192 num_pipes = len(self.pipes)
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
193 last_pipe = num_pipes - 1
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
194 processes = []
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
195 for index, command in enumerate([pipe[0] for pipe in self.pipes]):
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
196 if index == 0:
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
197 stdin = self.archive
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
198 else:
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
199 stdin = processes[-1].stdout
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
200 if index == last_pipe:
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
201 stdout = final_stdout
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
202 else:
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
203 stdout = subprocess.PIPE
106
dcf005ef7070 Start printing results ASAP with -l or -t.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 105
204 self.add_process(processes, command, stdin, stdout)
62
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
205 self.exit_codes = [pipe.wait() for pipe in processes]
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
206 self.archive.close()
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
207 for index in range(last_pipe):
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
208 processes[index].stdout.close()
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
209 self.archive = final_stdout
62
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
210
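`run_pipes` wires each queued command's stdout to the next command's stdin and records every exit code. The following is a simplified sketch of that chaining in Python 3, not the method itself: it captures the last stage's output in memory instead of writing to `final_stdout`, and `run_pipeline` along with the two helper commands are hypothetical names chosen for the example.

```python
import subprocess
import sys
import tempfile


def run_pipeline(commands, source):
    """Chain commands so each one reads the previous one's stdout.

    Returns (list of exit codes, bytes written by the last command).
    """
    processes = []
    for index, command in enumerate(commands):
        stdin = source if index == 0 else processes[-1].stdout
        processes.append(subprocess.Popen(command, stdin=stdin,
                                          stdout=subprocess.PIPE))
    # Drop the parent's copies of the intermediate pipe ends so an early
    # stage sees a closed pipe if a later stage exits first.
    for process in processes[:-1]:
        process.stdout.close()
    output = processes[-1].stdout.read()
    processes[-1].stdout.close()
    exit_codes = [process.wait() for process in processes]
    return exit_codes, output


if __name__ == '__main__':
    # Two tiny stand-in filters run through the current interpreter, so
    # the demo does not depend on zcat, tar, or any other external tool.
    upper = [sys.executable, '-c',
             'import sys; sys.stdout.write(sys.stdin.read().upper())']
    count = [sys.executable, '-c',
             'import sys; sys.stdout.write(str(len(sys.stdin.read())))']
    with tempfile.TemporaryFile() as source:
        source.write(b'dtrx')
        source.seek(0)
        print(run_pipeline([upper, count], source))
```

Collecting all exit codes, rather than only the last one, is what lets a caller tell a failed decoder apart from a failed extractor.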
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
211 def prepare(self):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
212 pass
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
213
47
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
214 def check_included_archives(self):
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
215 if (self.content_name is None) or (not self.content_name.endswith('/')):
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
216 self.included_root = './'
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
217 else:
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
218 self.included_root = self.content_name
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
219 start_index = len(self.included_root)
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
220 for path, dirname, filenames in os.walk(self.included_root):
69
35a2f45cdd3b Count files in the archive and report that in the recursion prompt.
Brett Smith <brett@brettcsmith.org>
parents: 66
221 self.file_count += len(filenames)
47
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
222 path = path[start_index:]
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
223 for filename in filenames:
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
224 if (ExtractorBuilder.try_by_mimetype(filename) or
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
225 ExtractorBuilder.try_by_extension(filename)):
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
226 self.included_archives.append(os.path.join(path, filename))
22
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
227
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
228 def check_contents(self):
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
229 if not self.contents:
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
230 self.content_type = EMPTY
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
231 elif len(self.contents) == 1:
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
232 if self.basename() == self.contents[0]:
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
233 self.content_type = MATCHING_DIRECTORY
65
0aea49161478 Make the wording on the One Entry question a little clearer.
Brett Smith <brett@brettcsmith.org>
parents: 62
diff changeset
234 elif os.path.isdir(self.contents[0]):
0aea49161478 Make the wording on the One Entry question a little clearer.
Brett Smith <brett@brettcsmith.org>
parents: 62
diff changeset
235 self.content_type = ONE_ENTRY_DIRECTORY
22
b240777ae53e [svn] Improve the way we check archive contents. If all the entries look like
brett
parents: 20
diff changeset
236 else:
65
0aea49161478 Make the wording on the One Entry question a little clearer.
Brett Smith <brett@brettcsmith.org>
parents: 62
diff changeset
237 self.content_type = ONE_ENTRY_FILE
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
238 self.content_name = self.contents[0]
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
239 if os.path.isdir(self.contents[0]):
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
240 self.content_name += '/'
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
241 else:
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
242 self.content_type = BOMB
47
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
243 self.check_included_archives()
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
244
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
245 def basename(self):
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
246 pieces = os.path.basename(self.filename).split('.')
2
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
247 extension = '.' + pieces[-1]
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
248 if mimetypes.encodings_map.has_key(extension):
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
249 pieces.pop()
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
250 extension = '.' + pieces[-1]
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
251 if (mimetypes.types_map.has_key(extension) or
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
252 mimetypes.common_types.has_key(extension) or
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
253 mimetypes.suffix_map.has_key(extension)):
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
254 pieces.pop()
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
255 return '.'.join(pieces)
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
256
78
978307ec7d11 Don't show errors from failed extractors unless they all fail.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 77
diff changeset
257 def get_stderr(self):
62
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
diff changeset
258 self.stderr.seek(0, 0)
78
978307ec7d11 Don't show errors from failed extractors unless they all fail.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 77
diff changeset
259 errors = self.stderr.read(-1)
62
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
diff changeset
260 self.stderr.close()
78
978307ec7d11 Don't show errors from failed extractors unless they all fail.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 77
diff changeset
261 return errors
978307ec7d11 Don't show errors from failed extractors unless they all fail.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 77
diff changeset
262
978307ec7d11 Don't show errors from failed extractors unless they all fail.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 77
diff changeset
263 def check_success(self, got_output):
62
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
diff changeset
264 error_index = self.first_bad_exit_code()
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
diff changeset
265 if (not got_output) and (error_index is not None):
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
diff changeset
266 command = ' '.join(self.pipes[error_index][0])
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
diff changeset
267 raise ExtractorError("%s error: '%s' returned status code %s" %
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
diff changeset
268 (self.pipes[error_index][1], command,
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
diff changeset
269 self.exit_codes[error_index]))
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
diff changeset
270
80
df9b3428e28f Move more common extraction/listing functionality into BaseExtractor.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 79
diff changeset
271 def extract_archive(self):
df9b3428e28f Move more common extraction/listing functionality into BaseExtractor.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 79
diff changeset
272 self.pipe(self.extract_pipe)
df9b3428e28f Move more common extraction/listing functionality into BaseExtractor.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 79
diff changeset
273 self.run_pipes()
df9b3428e28f Move more common extraction/listing functionality into BaseExtractor.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 79
diff changeset
274
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
275 def extract(self):
40
ee6a869f8da1 [svn] Be a little nicer about explaining that we can't extract to the current
brett
parents: 39
diff changeset
276 try:
ee6a869f8da1 [svn] Be a little nicer about explaining that we can't extract to the current
brett
parents: 39
diff changeset
277 self.target = tempfile.mkdtemp(prefix='.dtrx-', dir='.')
ee6a869f8da1 [svn] Be a little nicer about explaining that we can't extract to the current
brett
parents: 39
diff changeset
278 except (OSError, IOError), error:
ee6a869f8da1 [svn] Be a little nicer about explaining that we can't extract to the current
brett
parents: 39
diff changeset
279 raise ExtractorError("cannot extract here: %s" % (error.strerror,))
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
280 old_path = os.path.realpath(os.curdir)
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
281 os.chdir(self.target)
33
3547e3124729 [svn] Fix some bugs and make things a little more user-friendly now that we can
brett
parents: 32
diff changeset
282 try:
3547e3124729 [svn] Fix some bugs and make things a little more user-friendly now that we can
brett
parents: 32
diff changeset
283 self.archive.seek(0, 0)
3547e3124729 [svn] Fix some bugs and make things a little more user-friendly now that we can
brett
parents: 32
diff changeset
284 self.extract_archive()
91
ececf7836546 Make sure all extractors get self.contents defined.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 89
diff changeset
285 self.contents = os.listdir('.')
33
3547e3124729 [svn] Fix some bugs and make things a little more user-friendly now that we can
brett
parents: 32
diff changeset
286 self.check_contents()
62
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
diff changeset
287 self.check_success(self.content_type != EMPTY)
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
288 except EXTRACTION_ERRORS:
48
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
289 self.archive.close()
33
3547e3124729 [svn] Fix some bugs and make things a little more user-friendly now that we can
brett
parents: 32
diff changeset
290 os.chdir(old_path)
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
291 shutil.rmtree(self.target, ignore_errors=True)
33
3547e3124729 [svn] Fix some bugs and make things a little more user-friendly now that we can
brett
parents: 32
diff changeset
292 raise
48
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
293 self.archive.close()
14
6f9e1bb59719 [svn] Add support for just decompressing files that are compressed. So, if you
brett
parents: 13
diff changeset
294 os.chdir(old_path)
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
295
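The `extract` method above unpacks into a freshly created hidden directory and removes it again if anything goes wrong. The sketch below isolates that pattern for Python 3. `run_in_scratch_dir` and its `action` callback are hypothetical names used only for illustration, and the sketch catches any `Exception` where dtrx catches its own `EXTRACTION_ERRORS` tuple.

```python
import os
import shutil
import tempfile


def run_in_scratch_dir(action, parent='.'):
    """Call action() inside a new directory under parent and return its path.

    The directory is deleted again if action() raises.
    """
    # Resolve to an absolute path before changing directories.
    target = os.path.abspath(tempfile.mkdtemp(prefix='.dtrx-', dir=parent))
    old_path = os.path.realpath(os.curdir)
    os.chdir(target)
    try:
        action()
    except Exception:
        # Leave the scratch directory before deleting it, then re-raise.
        os.chdir(old_path)
        shutil.rmtree(target, ignore_errors=True)
        raise
    os.chdir(old_path)
    return target
```

Working in a scratch directory first means a failed or partial extraction never leaves stray files next to the user's own.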
108
b8316c2b36df Add support for new .deb archives with data.tar.{bz2,lzma}.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 106
diff changeset
296 def get_filenames(self, internal=False):
b8316c2b36df Add support for new .deb archives with data.tar.{bz2,lzma}.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 106
diff changeset
297 if not internal:
b8316c2b36df Add support for new .deb archives with data.tar.{bz2,lzma}.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 106
diff changeset
298 self.pipe(self.list_pipe, "listing")
106
dcf005ef7070 Start printing results ASAP with -l or -t.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 105
diff changeset
299 processes = []
dcf005ef7070 Start printing results ASAP with -l or -t.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 105
diff changeset
300 stdin = self.archive
dcf005ef7070 Start printing results ASAP with -l or -t.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 105
diff changeset
301 for command in [pipe[0] for pipe in self.pipes]:
dcf005ef7070 Start printing results ASAP with -l or -t.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 105
diff changeset
302 self.add_process(processes, command, stdin, subprocess.PIPE)
dcf005ef7070 Start printing results ASAP with -l or -t.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 105
diff changeset
303 stdin = processes[-1].stdout
dcf005ef7070 Start printing results ASAP with -l or -t.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 105
diff changeset
304 get_output_line = processes[-1].stdout.readline
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
305 while True:
106
dcf005ef7070 Start printing results ASAP with -l or -t.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 105
diff changeset
306 line = get_output_line()
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
307 if not line:
106
dcf005ef7070 Start printing results ASAP with -l or -t.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 105
diff changeset
308 break
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
309 yield line.rstrip('\n')
106
dcf005ef7070 Start printing results ASAP with -l or -t.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 105
diff changeset
310 self.exit_codes = [pipe.wait() for pipe in processes]
dcf005ef7070 Start printing results ASAP with -l or -t.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 105
diff changeset
311 self.archive.close()
dcf005ef7070 Start printing results ASAP with -l or -t.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 105
diff changeset
312 for process in processes:
dcf005ef7070 Start printing results ASAP with -l or -t.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 105
diff changeset
313 process.stdout.close()
dcf005ef7070 Start printing results ASAP with -l or -t.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 105
diff changeset
314 self.check_success(False)
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
315
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
316
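The listing path in `get_filenames` above feeds each command's stdout into the next command's stdin and yields lines as soon as they arrive. A minimal Python 3 sketch of that idea follows. `stream_pipeline` is a hypothetical standalone helper, not part of dtrx, and it raises a plain `RuntimeError` where dtrx would raise `ExtractorError`.

```python
import subprocess


def stream_pipeline(commands, stdin=None):
    """Run commands as a shell-style pipeline and yield output lines lazily."""
    if not commands:
        return
    if stdin is None:
        stdin = subprocess.DEVNULL  # assumption: no input unless a file is given
    processes = []
    for command in commands:
        process = subprocess.Popen(command, stdin=stdin,
                                   stdout=subprocess.PIPE,
                                   universal_newlines=True)
        if processes:
            # The child holds its own copy of the previous pipe; close ours
            # so a reader that exits early cannot leave a writer blocked.
            processes[-1].stdout.close()
        processes.append(process)
        stdin = process.stdout
    for line in processes[-1].stdout:
        yield line.rstrip('\n')
    processes[-1].stdout.close()
    exit_codes = [process.wait() for process in processes]
    for command, code in zip(commands, exit_codes):
        if code != 0:
            raise RuntimeError("'%s' returned status code %s"
                               % (' '.join(command), code))
```

Because this is a generator, exit codes are only checked once the caller has consumed every line, which is the same trade-off the method above makes in order to start printing results early.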
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
317 class CompressionExtractor(BaseExtractor):
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
318 file_type = 'compressed file'
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
319 name_checker = FilenameChecker
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
320
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
321 def basename(self):
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
322 pieces = os.path.basename(self.filename).split('.')
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
323 extension = '.' + pieces[-1]
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
324 if mimetypes.encodings_map.has_key(extension):
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
325 pieces.pop()
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
326 return '.'.join(pieces)
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
327
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
328 def get_filenames(self):
79
9c0cc7aef510 Improve dtrx -l performance on misnamed files, and clean other error messages.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 78
diff changeset
329 # This code used to just immediately yield the basename, under the
9c0cc7aef510 Improve dtrx -l performance on misnamed files, and clean other error messages.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 78
diff changeset
330 # assumption that that would be the filename. However, if that
9c0cc7aef510 Improve dtrx -l performance on misnamed files, and clean other error messages.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 78
diff changeset
331 # happens, dtrx -l will report this as a valid result for files with
9c0cc7aef510 Improve dtrx -l performance on misnamed files, and clean other error messages.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 78
diff changeset
332 # compression extensions, even if those files shouldn't actually be
9c0cc7aef510 Improve dtrx -l performance on misnamed files, and clean other error messages.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 78
diff changeset
333 # handled this way. So, we call out to the file command to do a quick
9c0cc7aef510 Improve dtrx -l performance on misnamed files, and clean other error messages.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 78
diff changeset
334 # check and make sure this actually looks like a compressed file.
9c0cc7aef510 Improve dtrx -l performance on misnamed files, and clean other error messages.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 78
diff changeset
335 if 'compress' not in [match[0] for match in
9c0cc7aef510 Improve dtrx -l performance on misnamed files, and clean other error messages.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 78
diff changeset
336 ExtractorBuilder.try_by_magic(self.filename)]:
9c0cc7aef510 Improve dtrx -l performance on misnamed files, and clean other error messages.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 78
diff changeset
337 raise ExtractorError("doesn't look like a compressed file")
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
338 yield self.basename()
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
339
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
340 def extract(self):
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
341 self.content_type = ONE_ENTRY_KNOWN
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
342 self.content_name = self.basename()
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
343 self.contents = None
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
344 self.included_root = './'
40
ee6a869f8da1 [svn] Be a little nicer about explaining that we can't extract to the current
brett
parents: 39
diff changeset
345 try:
ee6a869f8da1 [svn] Be a little nicer about explaining that we can't extract to the current
brett
parents: 39
diff changeset
346 output_fd, self.target = tempfile.mkstemp(prefix='.dtrx-', dir='.')
ee6a869f8da1 [svn] Be a little nicer about explaining that we can't extract to the current
brett
parents: 39
diff changeset
347 except (OSError, IOError), error:
ee6a869f8da1 [svn] Be a little nicer about explaining that we can't extract to the current
brett
parents: 39
diff changeset
348 raise ExtractorError("cannot extract here: %s" % (error.strerror,))
62
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
diff changeset
349 self.run_pipes(output_fd)
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
diff changeset
350 os.close(output_fd)
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
351 try:
62
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
diff changeset
352 self.check_success(os.stat(self.target)[stat.ST_SIZE] > 0)
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
353 except EXTRACTION_ERRORS:
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
354 os.unlink(self.target)
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
355 raise
62
17d845dacff5 Deal with partially extracted tarballs.
Brett Smith <brett@brettcsmith.org>
parents: 59
diff changeset
356
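The comment in `CompressionExtractor.get_filenames` explains that a compression extension alone is not trusted, so dtrx asks the `file` command (through `ExtractorBuilder.try_by_magic`) whether the data really looks compressed. As a rough illustration of the same check without an external command, the Python 3 sketch below compares the first bytes of a file against the well-known signatures of gzip, bzip2 and xz. This is a substitute technique, not what dtrx does, and `looks_compressed` and `MAGIC_PREFIXES` are hypothetical names.

```python
# Leading bytes that identify each format (hypothetical lookup table).
MAGIC_PREFIXES = {
    'gzip': b'\x1f\x8b',
    'bzip2': b'BZh',
    'xz': b'\xfd7zXZ\x00',
}


def looks_compressed(path):
    """Return the name of the matching compression format, or None."""
    longest = max(len(prefix) for prefix in MAGIC_PREFIXES.values())
    with open(path, 'rb') as handle:
        head = handle.read(longest)
    for name, prefix in MAGIC_PREFIXES.items():
        if head.startswith(prefix):
            return name
    return None
```

A caller could refuse to report a listing when this returns `None`, which mirrors the "doesn't look like a compressed file" error raised above.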
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
357 class TarExtractor(BaseExtractor):
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
358 file_type = 'tar file'
80
df9b3428e28f Move more common extraction/listing functionality into BaseExtractor.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 79
diff changeset
359 extract_pipe = ['tar', '-x']
df9b3428e28f Move more common extraction/listing functionality into BaseExtractor.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 79
diff changeset
360 list_pipe = ['tar', '-t']
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
361
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
362
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
363 class CpioExtractor(BaseExtractor):
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
364 file_type = 'cpio file'
80
df9b3428e28f Move more common extraction/listing functionality into BaseExtractor.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 79
diff changeset
365 extract_pipe = ['cpio', '-i', '--make-directories', '--quiet',
df9b3428e28f Move more common extraction/listing functionality into BaseExtractor.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 79
diff changeset
366 '--no-absolute-filenames']
83
cb56c72f3d42 Use the --quiet option for cpio -t too.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 82
diff changeset
367 list_pipe = ['cpio', '-t', '--quiet']
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
368
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
369
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
370 class RPMExtractor(CpioExtractor):
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
371 file_type = 'RPM'
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
372
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
373 def prepare(self):
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
374 self.pipe(['rpm2cpio', '-'], "rpm2cpio")
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
375
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
376 def basename(self):
9
920417b8acc9 [svn] Fix issues with basename methods. First, string's rsplit method only
brett
parents: 8
diff changeset
377 pieces = os.path.basename(self.filename).split('.')
2
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
378 if len(pieces) == 1:
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
379 return pieces[0]
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
380 elif pieces[-1] != 'rpm':
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
381 return BaseExtractor.basename(self)
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
382 pieces.pop()
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
383 if len(pieces) == 1:
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
384 return pieces[0]
9
920417b8acc9 [svn] Fix issues with basename methods. First, string's rsplit method only
brett
parents: 8
diff changeset
385 elif len(pieces[-1]) < 8:
2
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
386 pieces.pop()
1570351bf863 [svn] Fix a small bug that would crash the program if an archive was empty.
brett
parents: 1
diff changeset
387 return '.'.join(pieces)
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
388
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
389 def check_contents(self):
47
b034b6b4227d [svn] Fix various bugs in the recursive extraction.
brett
parents: 46
diff changeset
390 self.check_included_archives()
28
4d88f2231d33 [svn] Change all the license notices from GPLv2 to GPLv3.
brett
parents: 27
diff changeset
391 self.content_type = BOMB
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
392
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
393
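`RPMExtractor.basename` above derives a directory name by dropping the `.rpm` suffix and then a short trailing piece, which is usually the architecture tag. The sketch below restates that heuristic as a standalone Python 3 function so its effect on typical names is easy to try. `rpm_basename` is a hypothetical name, and the fallback for non-`.rpm` names uses `os.path.splitext` as a simplification where dtrx calls `BaseExtractor.basename`.

```python
import os


def rpm_basename(filename):
    """Guess a directory name for an RPM file name."""
    name = os.path.basename(filename)
    pieces = name.split('.')
    if len(pieces) == 1 or pieces[-1] != 'rpm':
        # Assumption: strip a single extension instead of consulting the
        # mimetypes tables the way BaseExtractor.basename does.
        return os.path.splitext(name)[0]
    pieces.pop()  # drop 'rpm'
    if len(pieces) > 1 and len(pieces[-1]) < 8:
        pieces.pop()  # drop a short tag such as 'x86_64' or 'noarch'
    return '.'.join(pieces)
```

The length test is only a heuristic: any short final piece is removed, so a name such as `foo-1.2.rpm` loses its `2` as well.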
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
394 class DebExtractor(TarExtractor):
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
395 file_type = 'Debian package'
108
b8316c2b36df Add support for new .deb archives with data.tar.{bz2,lzma}.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 106
diff changeset
396 data_re = re.compile(r'^data\.tar\.[a-z0-9]+$')
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
397
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
398 def prepare(self):
108
b8316c2b36df Add support for new .deb archives with data.tar.{bz2,lzma}.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 106
diff changeset
399 self.pipe(['ar', 't', self.filename], "finding package data file")
b8316c2b36df Add support for new .deb archives with data.tar.{bz2,lzma}.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 106
diff changeset
400 for filename in self.get_filenames(internal=True):
b8316c2b36df Add support for new .deb archives with data.tar.{bz2,lzma}.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 106
diff changeset
401 if self.data_re.match(filename):
b8316c2b36df Add support for new .deb archives with data.tar.{bz2,lzma}.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 106
diff changeset
402 data_filename = filename
b8316c2b36df Add support for new .deb archives with data.tar.{bz2,lzma}.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 106
diff changeset
403 break
b8316c2b36df Add support for new .deb archives with data.tar.{bz2,lzma}.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 106
diff changeset
404 else:
b8316c2b36df Add support for new .deb archives with data.tar.{bz2,lzma}.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 106
diff changeset
405 raise ExtractorError(".deb contains no data.tar file")
b8316c2b36df Add support for new .deb archives with data.tar.{bz2,lzma}.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 106
diff changeset
406 self.archive.seek(0, 0)
b8316c2b36df Add support for new .deb archives with data.tar.{bz2,lzma}.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 106
diff changeset
407 self.pipes.pop()
b8316c2b36df Add support for new .deb archives with data.tar.{bz2,lzma}.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 106
diff changeset
408 # self.pipes = start_pipes
b8316c2b36df Add support for new .deb archives with data.tar.{bz2,lzma}.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 106
diff changeset
409 encoding = mimetypes.guess_type(data_filename)[1]
b8316c2b36df Add support for new .deb archives with data.tar.{bz2,lzma}.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 106
diff changeset
410 if not encoding:
b8316c2b36df Add support for new .deb archives with data.tar.{bz2,lzma}.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 106
diff changeset
411 raise ExtractorError("data.tar file has unrecognized encoding")
b8316c2b36df Add support for new .deb archives with data.tar.{bz2,lzma}.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 106
diff changeset
412 self.pipe(['ar', 'p', self.filename, data_filename],
b8316c2b36df Add support for new .deb archives with data.tar.{bz2,lzma}.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 106
diff changeset
413 "extracting data.tar from .deb")
b8316c2b36df Add support for new .deb archives with data.tar.{bz2,lzma}.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 106
        self.pipe([self.decoders[encoding]], "decoding data.tar")

    def basename(self):
        pieces = os.path.basename(self.filename).split('_')
        if len(pieces) == 1:
            return pieces[0]
        last_piece = pieces.pop()
        if (len(last_piece) > 10) or (not last_piece.endswith('.deb')):
            return BaseExtractor.basename(self)
        return '_'.join(pieces)

    def check_contents(self):
        self.check_included_archives()
        self.content_type = BOMB


class DebMetadataExtractor(DebExtractor):
    def prepare(self):
        self.pipe(['ar', 'p', self.filename, 'control.tar.gz'],
                  "control.tar.gz extraction")
        self.pipe(['zcat'], "control.tar.gz decompression")


class GemExtractor(TarExtractor):
    file_type = 'Ruby gem'

    def prepare(self):
        self.pipe(['tar', '-xO', 'data.tar.gz'], "data.tar.gz extraction")
        self.pipe(['zcat'], "data.tar.gz decompression")

    def check_contents(self):
        self.check_included_archives()
        self.content_type = BOMB


class GemMetadataExtractor(CompressionExtractor):
    file_type = 'Ruby gem'

    def prepare(self):
        self.pipe(['tar', '-xO', 'metadata.gz'], "metadata.gz extraction")
        self.pipe(['zcat'], "metadata.gz decompression")

    def basename(self):
        return os.path.basename(self.filename) + '-metadata.txt'


class NoPipeExtractor(BaseExtractor):
    # Some extraction tools won't accept the archive from stdin. With
    # these, the piping infrastructure we normally set up generally doesn't
    # work, at least at first. We can still use most of it; we just don't
    # want to seed self.archive with the archive file, since that sucks up
    # memory. So instead we seed it with /dev/null, and specify the
    # filename on the command line as necessary. We also open the actual
    # file with os.open, to make sure we can actually do it (permissions
    # are good, etc.). This class doesn't do anything by itself; it's just
    # meant to be a base class for extractors that rely on these dumb
    # tools.
    def __init__(self, filename, encoding):
        os.close(os.open(filename, os.O_RDONLY))
        BaseExtractor.__init__(self, '/dev/null', None)
        self.filename = os.path.realpath(filename)

    def extract_archive(self):
        self.extract_pipe = self.extract_command + [self.filename]
        BaseExtractor.extract_archive(self)

    def get_filenames(self):
        self.list_pipe = self.list_command + [self.filename]
        return BaseExtractor.get_filenames(self)


class ZipExtractor(NoPipeExtractor):
    file_type = 'Zip file'
    extract_command = ['unzip', '-q']
    list_command = ['zipinfo', '-1']


class SevenExtractor(NoPipeExtractor):
    file_type = '7z file'
    extract_command = ['7z', 'x']
    list_command = ['7z', 'l']
    border_re = re.compile('^[- ]+$')

    def get_filenames(self):
        fn_index = None
        for line in NoPipeExtractor.get_filenames(self):
            if self.border_re.match(line):
                if fn_index is not None:
                    break
                else:
                    fn_index = string.rindex(line, ' ') + 1
            elif fn_index is not None:
                yield line[fn_index:]
        self.archive.close()


class CABExtractor(NoPipeExtractor):
    file_type = 'CAB archive'
    extract_command = ['cabextract', '-q']
    list_command = ['cabextract', '-l']
    border_re = re.compile(r'^[-\+]+$')

    def get_filenames(self):
        fn_index = None
        filenames = NoPipeExtractor.get_filenames(self)
        for line in filenames:
            if self.border_re.match(line):
                break
        for line in filenames:
            try:
                yield line.split(' | ', 2)[2]
            except IndexError:
                break
        self.archive.close()


class ShieldExtractor(NoPipeExtractor):
    file_type = 'InstallShield archive'
    extract_command = ['unshield', 'x']
    list_command = ['unshield', 'l']
    prefix_re = re.compile(r'^\s+\d+\s+')
    end_re = re.compile(r'^\s+-+\s+-+\s*$')

    def get_filenames(self):
        for line in NoPipeExtractor.get_filenames(self):
            if self.end_re.match(line):
                break
            else:
                match = self.prefix_re.match(line)
                if match:
                    yield line[match.end():]
        self.archive.close()

    def basename(self):
        result = NoPipeExtractor.basename(self)
        if result.endswith('.hdr'):
            result = result[:-4]
        return result


class RarExtractor(NoPipeExtractor):
    file_type = 'RAR archive'
    extract_command = ['unrar', 'x']
    list_command = ['unrar', 'l']
    border_re = re.compile('^-+$')

    def get_filenames(self):
        inside = False
        for line in NoPipeExtractor.get_filenames(self):
            if self.border_re.match(line):
                if inside:
                    break
                else:
                    inside = True
            elif inside:
                yield line.split(' ')[1]
        self.archive.close()


class BaseHandler(object):
    def __init__(self, extractor, options):
        self.extractor = extractor
        self.options = options
        self.target = None

    def handle(self):
        command = 'find'
        status = subprocess.call(['find', self.extractor.target, '-type', 'd',
                                  '-exec', 'chmod', 'u+rwx', '{}', ';'])
        if status == 0:
            command = 'chmod'
            status = subprocess.call(['chmod', '-R', 'u+rwX',
                                      self.extractor.target])
        if status != 0:
            return "%s returned with exit status %s" % (command, status)
        return self.organize()

    def set_target(self, target, checker):
        self.target = checker(target).check()
        if self.target != target:
            logger.warning("extracting %s to %s" %
                           (self.extractor.filename, self.target))


# The "where to extract" table, with options and archive types.
# This dictates the contents of each can_handle method.
#
#         Flat           Overwrite            None
# File    basename       basename             FilenameChecked
# Match   .              .                    tempdir + checked
# Bomb    .              basename             DirectoryChecked

class FlatHandler(BaseHandler):
    def can_handle(contents, options):
        return ((options.flat and (contents != ONE_ENTRY_KNOWN)) or
                (options.overwrite and (contents == MATCHING_DIRECTORY)))
    can_handle = staticmethod(can_handle)

    def organize(self):
        self.target = '.'
        for curdir, dirs, filenames in os.walk(self.extractor.target,
                                               topdown=False):
            path_parts = curdir.split(os.sep)
            if path_parts[0] == '.':
                del path_parts[1]
            else:
                del path_parts[0]
            newdir = os.path.join(*path_parts)
            if not os.path.isdir(newdir):
                os.makedirs(newdir)
            for filename in filenames:
                os.rename(os.path.join(curdir, filename),
                          os.path.join(newdir, filename))
            os.rmdir(curdir)


class OverwriteHandler(BaseHandler):
    def can_handle(contents, options):
        return ((options.flat and (contents == ONE_ENTRY_KNOWN)) or
                (options.overwrite and (contents != MATCHING_DIRECTORY)))
    can_handle = staticmethod(can_handle)

    def organize(self):
        self.target = self.extractor.basename()
        if os.path.isdir(self.target):
            shutil.rmtree(self.target)
        os.rename(self.extractor.target, self.target)


class MatchHandler(BaseHandler):
    def can_handle(contents, options):
        return ((contents == MATCHING_DIRECTORY) or
                ((contents in ONE_ENTRY_UNKNOWN) and
                 options.one_entry_policy.ok_for_match()))
    can_handle = staticmethod(can_handle)

    def organize(self):
        source = os.path.join(self.extractor.target,
                              os.listdir(self.extractor.target)[0])
        if os.path.isdir(source):
            checker = DirectoryChecker
        else:
            checker = FilenameChecker
        if self.options.one_entry_policy == EXTRACT_HERE:
            destination = self.extractor.content_name.rstrip('/')
        else:
            destination = self.extractor.basename()
        self.set_target(destination, checker)
        if os.path.isdir(self.extractor.target):
            os.rename(source, self.target)
            os.rmdir(self.extractor.target)
        else:
            os.rename(self.extractor.target, self.target)
        self.extractor.included_root = './'


class EmptyHandler(object):
    target = ''

    def can_handle(contents, options):
        return contents == EMPTY
    can_handle = staticmethod(can_handle)

    def __init__(self, extractor, options): pass
    def handle(self): pass


class BombHandler(BaseHandler):
    def can_handle(contents, options):
        return True
    can_handle = staticmethod(can_handle)

    def organize(self):
        basename = self.extractor.basename()
        self.set_target(basename, self.extractor.name_checker)
        os.rename(self.extractor.target, self.target)


class BasePolicy(object):
    # Ask the terminal for its size; the second field of the winsize
    # structure is the column count.  Fall back to 80 columns if the
    # ioctl fails (for example, when stdout is not a terminal).
    try:
        size = fcntl.ioctl(sys.stdout.fileno(), termios.TIOCGWINSZ,
                           struct.pack("HHHH", 0, 0, 0, 0))
        width = struct.unpack("HHHH", size)[1]
    except IOError:
        width = 80
    # Wrap one column short of the terminal width.
    width = width - 1
    choice_wrapper = textwrap.TextWrapper(width=width, initial_indent=' * ',
                                          subsequent_indent=' ',
                                          break_long_words=False)

    def __init__(self, options):
        self.current_policy = None
        if options.batch:
            self.permanent_policy = self.answers['']
        else:
            self.permanent_policy = None

    def ask_question(self, question):
        question = question + ["You can:"]
        for choice in self.choices:
            question.extend(self.choice_wrapper.wrap(choice))
        while True:
            print "\n".join(question)
            try:
                answer = raw_input(self.prompt)
            except EOFError:
                # End of input selects the default answer.
                return self.answers['']
            try:
                return self.answers[answer.lower()]
            except KeyError:
                # Unrecognized answer: print a blank line and ask again.
                print

    def wrap(self, question, *args):
        # Substitute each argument for the next '%s' word after splitting,
        # so a substituted value such as a filename is never broken across
        # lines, even if it contains whitespace.
        words = question.split()
        for arg in args:
            words[words.index('%s')] = arg
        result = [words.pop(0)]
        for word in words:
            extend = '%s %s' % (result[-1], word)
            if len(extend) > self.width:
                result.append(word)
            else:
                result[-1] = extend
        return result

    def __cmp__(self, other):
        return cmp(self.current_policy, other)


class OneEntryPolicy(BasePolicy):
    answers = {'h': EXTRACT_HERE, 'i': EXTRACT_WRAP, 'r': EXTRACT_RENAME,
               '': EXTRACT_WRAP}
    choice_template = ["extract the %s _I_nside a new directory named %s",
                       "extract the %s and _R_ename it %s",
                       "extract the %s _H_ere"]
    prompt = "What do you want to do? (I/r/h) "

    def __init__(self, options):
        BasePolicy.__init__(self, options)
        if options.flat:
            default = 'h'
        elif options.one_entry_default is not None:
            default = options.one_entry_default.lower()
        else:
            return
        if 'here'.startswith(default):
            self.permanent_policy = EXTRACT_HERE
        elif 'rename'.startswith(default):
            self.permanent_policy = EXTRACT_RENAME
        elif 'inside'.startswith(default):
            self.permanent_policy = EXTRACT_WRAP
        elif default is not None:
            raise ValueError("bad value %s for default policy" % (default,))

    def prep(self, archive_filename, extractor):
        question = self.wrap(
            "%s contains one %s but its name doesn't match.",
            archive_filename, extractor.content_type)
        question.append(" Expected: " + extractor.basename())
        question.append(" Actual: " + extractor.content_name)
        choice_vars = (extractor.content_type, extractor.basename())
        self.choices = [text % choice_vars[:text.count('%s')]
                        for text in self.choice_template]
        self.current_policy = (self.permanent_policy or
                               self.ask_question(question))

    def ok_for_match(self):
        return self.current_policy in (EXTRACT_RENAME, EXTRACT_HERE)


class RecursionPolicy(BasePolicy):
    answers = {'o': RECURSE_ONCE, 'a': RECURSE_ALWAYS, 'n': RECURSE_NOT_NOW,
               'v': RECURSE_NEVER, 'l': RECURSE_LIST, '': RECURSE_NOT_NOW}
    choices = ["_A_lways extract included archives during this session",
               "extract included archives this _O_nce",
               "choose _N_ot to extract included archives this once",
               "ne_V_er extract included archives during this session",
               "_L_ist included archives"]
    prompt = "What do you want to do? (a/o/N/v/l) "

    def __init__(self, options):
        BasePolicy.__init__(self, options)
        if options.show_list:
            self.permanent_policy = RECURSE_NEVER
        elif options.recursive:
            self.permanent_policy = RECURSE_ALWAYS

    def prep(self, current_filename, target, extractor):
        archive_count = len(extractor.included_archives)
        # Nothing to ask if a permanent policy is set or there are no
        # included archives.
        if (self.permanent_policy is not None) or (archive_count == 0):
            self.current_policy = self.permanent_policy or RECURSE_NOT_NOW
            return
        question = self.wrap(
            "%s contains %s other archive file(s), out of %s file(s) total.",
            current_filename, archive_count, extractor.file_count)
        if target == '.':
            target = ''
        included_root = extractor.included_root
        if included_root == './':
            included_root = ''
        # Choosing "list" prints the included archives and asks again.
        while True:
            self.current_policy = self.ask_question(question)
            if self.current_policy != RECURSE_LIST:
                break
            print ("\n%s\n" %
                   '\n'.join([os.path.join(target, included_root, filename)
                              for filename in extractor.included_archives]))
        # "Always" and "never" persist for the rest of the session.
        if self.current_policy in (RECURSE_ALWAYS, RECURSE_NEVER):
            self.permanent_policy = self.current_policy

    def ok_to_recurse(self):
        return self.current_policy in (RECURSE_ALWAYS, RECURSE_ONCE)


class ExtractorBuilder(object):
    extractor_map = {'tar': {'extractor': TarExtractor,
                             'mimetypes': ('x-tar',),
                             'extensions': ('tar',),
                             'magic': ('POSIX tar archive',)},
                     'zip': {'extractor': ZipExtractor,
                             'mimetypes': ('zip',),
                             'extensions': ('zip',),
                             'magic': ('(Zip|ZIP self-extracting) archive',)},
                     'rpm': {'extractor': RPMExtractor,
                             'mimetypes': ('x-redhat-package-manager', 'x-rpm'),
                             'extensions': ('rpm',),
                             'magic': ('RPM',)},
                     'deb': {'extractor': DebExtractor,
                             'metadata': DebMetadataExtractor,
                             'mimetypes': ('x-debian-package',),
                             'extensions': ('deb',),
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
845 'magic': ('Debian binary package',)},
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
846 'cpio': {'extractor': CpioExtractor,
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
847 'mimetypes': ('x-cpio',),
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
848 'extensions': ('cpio',),
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
849 'magic': ('cpio archive',)},
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
850 'gem': {'extractor': GemExtractor,
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
851 'metadata': GemMetadataExtractor,
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
852 'mimetypes': ('x-ruby-gem',),
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
853 'extensions': ('gem',)},
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
854 '7z': {'extractor': SevenExtractor,
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
855 'mimetypes': ('x-7z-compressed',),
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
856 'extensions': ('7z',),
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
857 'magic': ('7-zip archive',)},
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
858 'cab': {'extractor': CABExtractor,
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
859 'mimetypes': ('x-cab',),
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
860 'extensions': ('cab',),
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
861 'magic': ('Microsoft Cabinet Archive',)},
92
d9319958bb5a Add RAR support. Thanks to Peter Kelemen for the patch.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 89
diff changeset
862 'rar': {'extractor': RarExtractor,
d9319958bb5a Add RAR support. Thanks to Peter Kelemen for the patch.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 89
diff changeset
863 'mimetypes': ('rar',),
d9319958bb5a Add RAR support. Thanks to Peter Kelemen for the patch.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 89
diff changeset
864 'extensions': ('rar',),
d9319958bb5a Add RAR support. Thanks to Peter Kelemen for the patch.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 89
diff changeset
865 'magic': ('RAR archive',)},
81
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
866 'shield': {'extractor': ShieldExtractor,
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
867 'mimetypes': ('x-cab',),
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
868 'extensions': ('cab', 'hdr'),
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
869 'magic': ('InstallShield CAB',)},
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
870 'compress': {'extractor': CompressionExtractor}
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
871 }
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
872
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
873 mimetype_map = {}
81
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
874 magic_mime_map = {}
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
875 extension_map = {}
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
876 for ext_name, ext_info in extractor_map.items():
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
877 for mimetype in ext_info.get('mimetypes', ()):
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
878 if '/' not in mimetype:
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
879 mimetype = 'application/' + mimetype
81
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
880 mimetype_map[mimetype] = ext_name
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
881 for magic_re in ext_info.get('magic', ()):
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
882 magic_mime_map[re.compile(magic_re)] = ext_name
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
883 for extension in ext_info.get('extensions', ()):
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
884 extension_map.setdefault(extension, []).append((ext_name, None))
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
885
115
d670445a0a9b Support more .tar.whatever extensions.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 114
diff changeset
886 for mapping in (('tar', 'bzip2', 'tar.bz2', 'tbz2', 'tb2', 'tbz'),
81
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
887 ('tar', 'gzip', 'tar.gz', 'tgz'),
111
09be2149c500 add support for LZMA-compressed tar file extensions
Brett Smith <brettcsmith@brettcsmith.org>
parents: 110
diff changeset
888 ('tar', 'lzma', 'tar.lzma', 'tlz'),
123
8570c14304bb add support for xz compression
Brett Smith <brettcsmith@brettcsmith.org>
parents: 121
diff changeset
889 ('tar', 'xz', 'tar.xz'),
115
d670445a0a9b Support more .tar.whatever extensions.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 114
diff changeset
890 ('tar', 'compress', 'tar.Z', 'taz'),
81
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
891 ('compress', 'gzip', 'Z', 'gz'),
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
892 ('compress', 'bzip2', 'bz2'),
123
8570c14304bb add support for xz compression
Brett Smith <brettcsmith@brettcsmith.org>
parents: 121
diff changeset
893 ('compress', 'lzma', 'lzma'),
8570c14304bb add support for xz compression
Brett Smith <brettcsmith@brettcsmith.org>
parents: 121
diff changeset
894 ('compress', 'xz', 'xz')):
81
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
895 for extension in mapping[2:]:
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
896 extension_map.setdefault(extension, []).append(mapping[:2])
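The table-building loops above can be tried in isolation. The following is a minimal sketch, assuming a trimmed-down `extractor_map` with placeholder entries (the real one maps to extractor classes); it shows how each extension ends up with a list of `(archive type, encoding)` pairs.

```python
# Hypothetical, trimmed-down table for illustration only; the real
# extractor_map also carries extractor classes, mimetypes and magic strings.
extractor_map = {
    'tar': {'extensions': ('tar',)},
    'zip': {'extensions': ('zip',)},
}

extension_map = {}
for ext_name, ext_info in extractor_map.items():
    for extension in ext_info.get('extensions', ()):
        # Plain archive extensions carry no compression encoding.
        extension_map.setdefault(extension, []).append((ext_name, None))

# Each tuple reads: (archive type, encoding, extension, extension, ...)
for mapping in (('tar', 'gzip', 'tar.gz', 'tgz'),
                ('compress', 'gzip', 'Z', 'gz')):
    for extension in mapping[2:]:
        extension_map.setdefault(extension, []).append(mapping[:2])
```

Using `setdefault(extension, [])` keeps a list per extension, so one extension can name several candidate extractors (in the full table, `cab` maps to both the `cab` and `shield` types).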
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
897
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
898 magic_encoding_map = {}
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
899 for mapping in (('bzip2', 'bzip2 compressed'),
98
6b7860fca221 Add support for LZMA magic detection.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 96
diff changeset
900 ('gzip', 'gzip compressed'),
123
8570c14304bb add support for xz compression
Brett Smith <brettcsmith@brettcsmith.org>
parents: 121
diff changeset
901 ('lzma', 'LZMA compressed'),
8570c14304bb add support for xz compression
Brett Smith <brettcsmith@brettcsmith.org>
parents: 121
diff changeset
902 ('xz', 'xz compressed')):
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
903 for pattern in mapping[1:]:
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
904 magic_encoding_map[re.compile(pattern)] = mapping[0]
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
905
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
906 def __init__(self, filename, options):
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
907 self.filename = filename
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
908 self.options = options
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
909
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
910 def build_extractor(self, archive_type, encoding):
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
911 extractors = self.extractor_map[archive_type]
81
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
912 if self.options.metadata and 'metadata' in extractors:
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
913 extractor = extractors['metadata']
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
914 else:
81
18f4fe62eff2 Move most ExtractorBuilder constants to the top.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 80
diff changeset
915 extractor = extractors['extractor']
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
916 return extractor(self.filename, encoding)
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
917
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
918 def get_extractor(self):
48
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
919 tried_types = set()
36
4bf2508d9b9e [svn] Small optimization to be nice to the system: don't try a given extractor
brett
parents: 35
diff changeset
920 # As smart as it is, the magic test can't go first, because at least
4bf2508d9b9e [svn] Small optimization to be nice to the system: don't try a given extractor
brett
parents: 35
diff changeset
921 # on my system it just recognizes gem files as tar files. I guess
4bf2508d9b9e [svn] Small optimization to be nice to the system: don't try a given extractor
brett
parents: 35
diff changeset
922 # it's possible for the opposite problem to occur -- where the mimetype
4bf2508d9b9e [svn] Small optimization to be nice to the system: don't try a given extractor
brett
parents: 35
diff changeset
923 # or extension suggests something less than ideal -- but it seems less
4bf2508d9b9e [svn] Small optimization to be nice to the system: don't try a given extractor
brett
parents: 35
diff changeset
924 # likely so I'm sticking with this.
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
925 for func_name in ('mimetype', 'extension', 'magic'):
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
926 logger.debug("getting extractors by %s" % (func_name,))
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
927 extractor_types = \
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
928 getattr(self, 'try_by_' + func_name)(self.filename)
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
929 logger.debug("done getting extractors")
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
930 for ext_args in extractor_types:
36
4bf2508d9b9e [svn] Small optimization to be nice to the system: don't try a given extractor
brett
parents: 35
diff changeset
931 if ext_args in tried_types:
4bf2508d9b9e [svn] Small optimization to be nice to the system: don't try a given extractor
brett
parents: 35
diff changeset
932 continue
4bf2508d9b9e [svn] Small optimization to be nice to the system: don't try a given extractor
brett
parents: 35
diff changeset
933 tried_types.add(ext_args)
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
934 logger.debug("trying %s extractor from %s" %
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
935 (ext_args, func_name))
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
936 yield self.build_extractor(*ext_args)
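The ordering and de-duplication in `get_extractor` can be sketched on their own. This is an illustrative stand-alone version, not the script's code: `ordered_candidates` and the three stub detectors are hypothetical names, and the stubs return fixed values instead of inspecting a file.

```python
def ordered_candidates(filename, strategies):
    """Yield (strategy name, candidate) pairs, skipping repeated candidates."""
    tried = set()
    for name, func in strategies:
        for candidate in func(filename):
            if candidate in tried:
                continue  # an earlier strategy already proposed this one
            tried.add(candidate)
            yield name, candidate


# Stand-in detectors with canned answers (assumptions for the example).
def by_mimetype(filename):
    return [('tar', 'gzip')]


def by_extension(filename):
    return [('tar', 'gzip'), ('compress', 'gzip')]


def by_magic(filename):
    return [('tar', 'gzip')]


result = list(ordered_candidates(
    'example.tar.gz',
    [('mimetype', by_mimetype),
     ('extension', by_extension),
     ('magic', by_magic)]))
```

Because the function is a generator, later strategies only run if the caller keeps asking for more candidates, which matches the intent of not trying a given extractor twice.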
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
937
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
938 def try_by_mimetype(cls, filename):
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
939 mimetype, encoding = mimetypes.guess_type(filename)
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
940 try:
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
941 return [(cls.mimetype_map[mimetype], encoding)]
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
942 except KeyError:
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
943 if encoding:
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
944 return [('compress', encoding)]
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
945 return []
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
946 try_by_mimetype = classmethod(try_by_mimetype)
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
947
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
948 def magic_map_matches(cls, output, magic_map):
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
949 return [result for regexp, result in magic_map.items()
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
950 if regexp.search(output)]
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
951 magic_map_matches = classmethod(magic_map_matches)
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
952
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
953 def try_by_magic(cls, filename):
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
954 process = subprocess.Popen(['file', '-z', filename],
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
955 stdout=subprocess.PIPE)
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
956 status = process.wait()
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
957 if status != 0:
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
958 return []
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
959 output = process.stdout.readline()
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
960 process.stdout.close()
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
961 if output.startswith('%s: ' % filename):
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
962 output = output[len(filename) + 2:]
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
963 mimes = cls.magic_map_matches(output, cls.magic_mime_map)
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
964 encodings = cls.magic_map_matches(output, cls.magic_encoding_map)
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
965 if mimes and not encodings:
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
966 encodings = [None]
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
967 elif encodings and not mimes:
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
968 mimes = ['compress']
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
969 return [(m, e) for m in mimes for e in encodings]
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
970 try_by_magic = classmethod(try_by_magic)
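The matching step of `try_by_magic` can be exercised without running `file`. The sketch below assumes two single-entry maps and feeds in description strings directly; the sample descriptions are illustrative and not captured from a real `file -z` run.

```python
import re

# Trimmed-down maps for illustration; the real ones are built from
# extractor_map and the encoding table above.
magic_mime_map = {re.compile('POSIX tar archive'): 'tar'}
magic_encoding_map = {re.compile('gzip compressed'): 'gzip'}


def matches(output, magic_map):
    """Return every mapped value whose pattern occurs in the description."""
    return [result for regexp, result in magic_map.items()
            if regexp.search(output)]


def candidates_from_description(output):
    mimes = matches(output, magic_mime_map)
    encodings = matches(output, magic_encoding_map)
    if mimes and not encodings:
        encodings = [None]        # an archive with no compression layer
    elif encodings and not mimes:
        mimes = ['compress']      # compressed data that is not an archive
    # Cross product: every archive type paired with every encoding.
    return [(m, e) for m in mimes for e in encodings]
```

When neither map matches, both lists stay empty and the cross product is empty, so the caller simply gets no candidates from this strategy.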
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
971
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
972 def try_by_extension(cls, filename):
43
4591a32eedc8 [svn] Sadly Python 2.3 does not have an rsplit method on strings.
brett
parents: 42
diff changeset
973 parts = filename.split('.')[-2:]
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
974 results = []
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
975 while parts:
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
976 results.extend(cls.extension_map.get('.'.join(parts), []))
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
977 del parts[0]
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
978 return results
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
979 try_by_extension = classmethod(try_by_extension)
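The extension lookup in `try_by_extension` tries the last two dotted parts first, then the last part alone. A minimal stand-alone sketch, assuming a small hand-written `sample_map` (the function name `candidates_by_extension` is hypothetical):

```python
def candidates_by_extension(filename, extension_map):
    # Keep at most the last two dot-separated parts, e.g. ['tar', 'gz'].
    parts = filename.split('.')[-2:]
    results = []
    while parts:
        # Try the longest suffix first ('tar.gz'), then the shorter ('gz').
        results.extend(extension_map.get('.'.join(parts), []))
        del parts[0]
    return results


# Hand-written sample data for the example.
sample_map = {
    'tar.gz': [('tar', 'gzip')],
    'gz': [('compress', 'gzip')],
}

tarball = candidates_by_extension('foo.tar.gz', sample_map)
plain = candidates_by_extension('notes.txt', sample_map)
```

Here `tarball` lists the tar-with-gzip candidate ahead of the plain gzip one, so the more specific interpretation is attempted first.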
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
980
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
981
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
982 class BaseAction(object):
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
983 def __init__(self, options, filenames):
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
984 self.options = options
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
985 self.filenames = filenames
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
986 self.target = None
106
dcf005ef7070 Start printing results ASAP with -l or -t.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 105
diff changeset
987 self.do_print = False
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
988
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
989 def report(self, function, *args):
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
990 try:
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
991 error = function(*args)
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
992 except EXTRACTION_ERRORS, exception:
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
993 error = str(exception)
33
3547e3124729 [svn] Fix some bugs and make things a little more user-friendly now that we can
brett
parents: 32
diff changeset
994 logger.debug(''.join(traceback.format_exception(*sys.exc_info())))
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
995 return error
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
996
106
dcf005ef7070 Start printing results ASAP with -l or -t.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 105
diff changeset
997 def show_filename(self, filename):
dcf005ef7070 Start printing results ASAP with -l or -t.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 105
diff changeset
998 if len(self.filenames) < 2:
dcf005ef7070 Start printing results ASAP with -l or -t.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 105
diff changeset
999 return
dcf005ef7070 Start printing results ASAP with -l or -t.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 105
diff changeset
1000 elif self.do_print:
dcf005ef7070 Start printing results ASAP with -l or -t.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 105
diff changeset
1001 print
dcf005ef7070 Start printing results ASAP with -l or -t.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 105
diff changeset
1002 else:
dcf005ef7070 Start printing results ASAP with -l or -t.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 105
diff changeset
1003 self.do_print = True
dcf005ef7070 Start printing results ASAP with -l or -t.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 105
diff changeset
1004 print "%s:" % (filename,)
dcf005ef7070 Start printing results ASAP with -l or -t.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 105
diff changeset
1005
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
1006
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
1007 class ExtractionAction(BaseAction):
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
1008 handlers = [FlatHandler, OverwriteHandler, MatchHandler, EmptyHandler,
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
1009 BombHandler]
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
1010
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
1011 def get_handler(self, extractor):
65
0aea49161478 Make the wording on the One Entry question a little clearer.
Brett Smith <brett@brettcsmith.org>
parents: 62
diff changeset
1012 if extractor.content_type in ONE_ENTRY_UNKNOWN:
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
1013 self.options.one_entry_policy.prep(self.current_filename,
65
0aea49161478 Make the wording on the One Entry question a little clearer.
Brett Smith <brett@brettcsmith.org>
parents: 62
diff changeset
1014 extractor)
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
1015 for handler in self.handlers:
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
1016 if handler.can_handle(extractor.content_type, self.options):
35
957b402d4b90 [svn] Add support for extracting CAB archives. Because the CAB archive I was
brett
parents: 34
diff changeset
1017 logger.debug("using %s handler" % (handler.__name__,))
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
1018 self.current_handler = handler(extractor, self.options)
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
1019 break
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
1020
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
1021 def show_extraction(self, extractor):
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
1022 if self.options.log_level > logging.INFO:
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
1023 return
106
dcf005ef7070 Start printing results ASAP with -l or -t.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 105
diff changeset
1024 self.show_filename(self.current_filename)
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
1025 if extractor.contents is None:
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
1026 print self.current_handler.target
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
1027 return
55
494516c027c4 [svn] Stupid Python 2.3 doesn't support [].sort(reverse=True).
brett
parents: 54
diff changeset
1028 def reverser(x, y):
494516c027c4 [svn] Stupid Python 2.3 doesn't support [].sort(reverse=True).
brett
parents: 54
diff changeset
1029 return cmp(y, x)
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
1030 if self.current_handler.target == '.':
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
1031 filenames = extractor.contents
55
494516c027c4 [svn] Stupid Python 2.3 doesn't support [].sort(reverse=True).
brett
parents: 54
diff changeset
1032 filenames.sort(reverser)
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
1033 else:
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
1034 filenames = [self.current_handler.target]
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
1035 pathjoin = os.path.join
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
1036 isdir = os.path.isdir
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
1037 while filenames:
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
1038 filename = filenames.pop()
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
1039 if isdir(filename):
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
1040 print "%s/" % (filename,)
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
1041 new_filenames = os.listdir(filename)
55
494516c027c4 [svn] Stupid Python 2.3 doesn't support [].sort(reverse=True).
brett
parents: 54
diff changeset
1042 new_filenames.sort(reverser)
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
1043 filenames.extend([pathjoin(filename, new_filename)
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
1044 for new_filename in new_filenames])
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
1045 else:
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
1046 print filename
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
1047
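The loop in show_extraction walks the extracted tree without recursion: it keeps a stack sorted in reverse, so that pop() always returns the next name in ascending order. The following is a minimal sketch of that idea, not the program's own code. It is written for Python 3, where sorted(..., reverse=True) replaces the cmp-based reverser helper; the name walk_in_order and the throwaway directory tree are assumptions made for illustration.

```python
import os
import tempfile


def walk_in_order(start_names):
    """Yield paths depth-first, in ascending name order, using a stack."""
    # Reverse-sort so that pop() (which takes from the end) returns the
    # smallest remaining name first.
    stack = sorted(start_names, reverse=True)
    while stack:
        path = stack.pop()
        if os.path.isdir(path):
            yield path + "/"
            children = sorted(os.listdir(path), reverse=True)
            stack.extend(os.path.join(path, child) for child in children)
        else:
            yield path


# Usage: build a small temporary tree (hypothetical names) and list it.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "a"))
    for parts in (("a", "c"), ("a", "b"), ("z",)):
        open(os.path.join(root, *parts), "w").close()
    starts = [os.path.join(root, "z"), os.path.join(root, "a")]
    listing = [os.path.relpath(p, root) + ("/" if p.endswith("/") else "")
               for p in walk_in_order(starts)]

for entry in listing:
    print(entry)
```

Because a directory's children are pushed immediately after the directory itself is printed, they come out before any of its later siblings, which is the same order a tar-style verbose listing would show.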
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
1048 def run(self, filename, extractor):
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
1049 self.current_filename = filename
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1050 error = (self.report(extractor.extract) or
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1051 self.report(self.get_handler, extractor) or
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
1052 self.report(self.current_handler.handle) or
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
1053 self.report(self.show_extraction, extractor))
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1054 if not error:
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
1055 self.target = self.current_handler.target
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1056 return error
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
1057
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
1058
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
1059 class ListAction(BaseAction):
106
dcf005ef7070 Start printing results ASAP with -l or -t.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 105
diff changeset
1060 def list_filenames(self, extractor, filename):
dcf005ef7070 Start printing results ASAP with -l or -t.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 105
diff changeset
1061 # We get a line first to make sure there's not going to be some
dcf005ef7070 Start printing results ASAP with -l or -t.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 105
diff changeset
1062 # basic error before we show what filename we're listing.
dcf005ef7070 Start printing results ASAP with -l or -t.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 105
diff changeset
1063 filename_lister = extractor.get_filenames()
dcf005ef7070 Start printing results ASAP with -l or -t.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 105
diff changeset
1064 try:
dcf005ef7070 Start printing results ASAP with -l or -t.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 105
diff changeset
1065 first_line = filename_lister.next()
dcf005ef7070 Start printing results ASAP with -l or -t.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 105
diff changeset
1066 except StopIteration:
dcf005ef7070 Start printing results ASAP with -l or -t.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 105
diff changeset
1067 self.show_filename(filename)
dcf005ef7070 Start printing results ASAP with -l or -t.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 105
diff changeset
1068 else:
dcf005ef7070 Start printing results ASAP with -l or -t.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 105
diff changeset
1069 self.did_list = True
dcf005ef7070 Start printing results ASAP with -l or -t.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 105
diff changeset
1070 self.show_filename(filename)
dcf005ef7070 Start printing results ASAP with -l or -t.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 105
diff changeset
1071 print first_line
dcf005ef7070 Start printing results ASAP with -l or -t.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 105
diff changeset
1072 for line in filename_lister:
dcf005ef7070 Start printing results ASAP with -l or -t.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 105
diff changeset
1073 print line
dcf005ef7070 Start printing results ASAP with -l or -t.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 105
diff changeset
1074
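The comment in list_filenames describes a "peek first" pattern: pull one item from the lister before printing the header, so that a basic failure surfaces before any output appears. A minimal Python 3 sketch of the pattern follows. The names with_header and broken_lister are hypothetical, and the sketch yields lines instead of printing them so that the behavior is easy to check.

```python
def with_header(header, lines):
    """Yield header, then every line, but only after the first line
    has been fetched successfully from the producer."""
    iterator = iter(lines)
    try:
        # If the producer fails immediately, the exception propagates
        # here, before the header has been emitted.
        first = next(iterator)
    except StopIteration:
        # Empty listing: still show the header on its own.
        yield header
        return
    yield header
    yield first
    for line in iterator:
        yield line


def broken_lister():
    """A producer that fails before yielding anything (illustrative)."""
    raise IOError("cannot open archive")
    yield  # unreachable; present only to make this a generator


print(list(with_header("archive.tar:", ["a.txt", "b.txt"])))

shown = []
try:
    for item in with_header("bad.tar:", broken_lister()):
        shown.append(item)
except IOError:
    pass
print(shown)
```

In the annotated Python 2 source the same step is spelled filename_lister.next(); the built-in next() used here is the Python 3 equivalent.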
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
1075 def run(self, filename, extractor):
106
dcf005ef7070 Start printing results ASAP with -l or -t.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 105
diff changeset
1076 self.did_list = False
dcf005ef7070 Start printing results ASAP with -l or -t.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 105
diff changeset
1077 error = self.report(self.list_filenames, extractor, filename)
dcf005ef7070 Start printing results ASAP with -l or -t.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 105
diff changeset
1078 if error and self.did_list:
dcf005ef7070 Start printing results ASAP with -l or -t.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 105
diff changeset
1079 logger.error("lister failed: ignore above listing for %s" %
dcf005ef7070 Start printing results ASAP with -l or -t.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 105
diff changeset
1080 (filename,))
dcf005ef7070 Start printing results ASAP with -l or -t.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 105
diff changeset
1081 return error
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
1082
125
c4495fc7d00d Add support for fetching archives from URLs
Matthew Wild <mwild1@gmail.com>
parents: 123
diff changeset
1083 class UrlHandler(urllib.FancyURLopener):
c4495fc7d00d Add support for fetching archives from URLs
Matthew Wild <mwild1@gmail.com>
parents: 123
diff changeset
1084 def http_error_default(self, url, fp, errcode, errmsg, headers):
c4495fc7d00d Add support for fetching archives from URLs
Matthew Wild <mwild1@gmail.com>
parents: 123
diff changeset
1085 urllib.URLopener.http_error_default(self, url, fp, errcode, errmsg, headers)
c4495fc7d00d Add support for fetching archives from URLs
Matthew Wild <mwild1@gmail.com>
parents: 123
diff changeset
1086
c4495fc7d00d Add support for fetching archives from URLs
Matthew Wild <mwild1@gmail.com>
parents: 123
diff changeset
1087 def is_url(self, url):
c4495fc7d00d Add support for fetching archives from URLs
Matthew Wild <mwild1@gmail.com>
parents: 123
diff changeset
1088 if url.startswith("http://"):
c4495fc7d00d Add support for fetching archives from URLs
Matthew Wild <mwild1@gmail.com>
parents: 123
diff changeset
1089 return True
c4495fc7d00d Add support for fetching archives from URLs
Matthew Wild <mwild1@gmail.com>
parents: 123
diff changeset
1090
c4495fc7d00d Add support for fetching archives from URLs
Matthew Wild <mwild1@gmail.com>
parents: 123
diff changeset
1091 def fetch(self, url):
c4495fc7d00d Add support for fetching archives from URLs
Matthew Wild <mwild1@gmail.com>
parents: 123
diff changeset
1092 i = url.rfind('/')
c4495fc7d00d Add support for fetching archives from URLs
Matthew Wild <mwild1@gmail.com>
parents: 123
diff changeset
1093 filename = url[i+1:]
c4495fc7d00d Add support for fetching archives from URLs
Matthew Wild <mwild1@gmail.com>
parents: 123
diff changeset
1094 try:
c4495fc7d00d Add support for fetching archives from URLs
Matthew Wild <mwild1@gmail.com>
parents: 123
diff changeset
1095 self.retrieve(url, filename)
c4495fc7d00d Add support for fetching archives from URLs
Matthew Wild <mwild1@gmail.com>
parents: 123
diff changeset
1096 except IOError:
c4495fc7d00d Add support for fetching archives from URLs
Matthew Wild <mwild1@gmail.com>
parents: 123
diff changeset
1097 return False, "Failed to fetch "+url
c4495fc7d00d Add support for fetching archives from URLs
Matthew Wild <mwild1@gmail.com>
parents: 123
diff changeset
1098 return True, filename
c4495fc7d00d Add support for fetching archives from URLs
Matthew Wild <mwild1@gmail.com>
parents: 123
diff changeset
1099
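fetch() derives the local filename by taking everything after the last '/' in the URL. The sketch below reproduces that slicing in isolation and shows two inputs where it gives a surprising result. The second function is an assumed alternative for comparison only, not something this changeset implements; its name and the "download" fallback are hypothetical. It is written for Python 3, where URL parsing lives in urllib.parse.

```python
import posixpath
from urllib.parse import urlsplit


def filename_from_url(url):
    """Same slicing as fetch(): the text after the last '/'."""
    return url[url.rfind('/') + 1:]


def path_based_filename(url, default="download"):
    """Assumed alternative: use only the path component of the URL,
    falling back to a placeholder name when the path ends in '/'."""
    name = posixpath.basename(urlsplit(url).path)
    return name or default


for url in ("http://example.com/pkg/foo.tar.gz",
            "http://example.com/pkg/",
            "http://example.com/foo.zip?x=1"):
    print(repr(filename_from_url(url)), repr(path_based_filename(url)))
```

With the plain slice, a URL ending in '/' yields an empty filename and a query string becomes part of the name; whether that matters depends on the URLs the tool is expected to receive.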
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
1100
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
1101 class ExtractorApplication(object):
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
1102 def __init__(self, arguments):
48
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
1103 for signal_num in (signal.SIGINT, signal.SIGTERM):
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
1104 signal.signal(signal_num, self.abort)
85
ad73f75c9046 Use the default action for SIGPIPE.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 84
diff changeset
1105 signal.signal(signal.SIGPIPE, signal.SIG_DFL)
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
1106 self.parse_options(arguments)
12
5d202467c589 [svn] Introduce a real logging system. Right now all this really gets us is the
brett
parents: 11
diff changeset
1107 self.setup_logger()
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
1108 self.successes = []
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
1109 self.failures = []
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
1110
77
3a1f49be7667 Clean the target directory if an extraction attempt failed.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 76
diff changeset
1111 def clean_destination(self, dest_name):
3a1f49be7667 Clean the target directory if an extraction attempt failed.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 76
diff changeset
1112 try:
3a1f49be7667 Clean the target directory if an extraction attempt failed.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 76
diff changeset
1113 os.unlink(dest_name)
3a1f49be7667 Clean the target directory if an extraction attempt failed.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 76
diff changeset
1114 except OSError, error:
3a1f49be7667 Clean the target directory if an extraction attempt failed.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 76
diff changeset
1115 if error.errno == errno.EISDIR:
3a1f49be7667 Clean the target directory if an extraction attempt failed.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 76
diff changeset
1116 shutil.rmtree(dest_name, ignore_errors=True)
3a1f49be7667 Clean the target directory if an extraction attempt failed.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 76
diff changeset
1117
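clean_destination tries os.unlink first and falls back to shutil.rmtree when the error is EISDIR. Linux reports EISDIR for unlink on a directory, but some other systems report EPERM instead, in which case the fallback would not trigger. The following Python 3 sketch is an assumed variant that checks the path type up front; it is offered as an illustration of the same cleanup intent, not as the project's implementation. The name remove_target is hypothetical.

```python
import os
import shutil
import tempfile


def remove_target(dest_name):
    """Remove a file, symlink, or directory tree; ignore failures,
    as the original cleanup does."""
    if os.path.isdir(dest_name) and not os.path.islink(dest_name):
        shutil.rmtree(dest_name, ignore_errors=True)
    else:
        try:
            os.unlink(dest_name)
        except OSError:
            pass  # nothing to clean, or not permitted; stay quiet


# Usage on a throwaway tree.
with tempfile.TemporaryDirectory() as root:
    tree = os.path.join(root, "extracted")
    os.makedirs(os.path.join(tree, "sub"))
    single = os.path.join(root, "single.txt")
    open(single, "w").close()

    remove_target(tree)
    remove_target(single)
    remove_target(os.path.join(root, "missing"))  # no error raised
    leftovers = os.listdir(root)

print(leftovers)
```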
48
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
1118 def abort(self, signal_num, frame):
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
1119 signal.signal(signal_num, signal.SIG_IGN)
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
1120 print
49
c76dd2716113 [svn] Add a traceback when we catch a signal.
brett
parents: 48
diff changeset
1121 logger.debug("traceback:\n" +
c76dd2716113 [svn] Add a traceback when we catch a signal.
brett
parents: 48
diff changeset
1122 ''.join(traceback.format_stack(frame)).rstrip())
86
e02ca4e9bf42 Be more careful on SIGINT/SIGKILL cleanup.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 85
diff changeset
1123 logger.debug("got signal %s" % (signal_num,))
e02ca4e9bf42 Be more careful on SIGINT/SIGKILL cleanup.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 85
diff changeset
1124 try:
e02ca4e9bf42 Be more careful on SIGINT/SIGKILL cleanup.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 85
diff changeset
1125 basename = self.current_extractor.target
e02ca4e9bf42 Be more careful on SIGINT/SIGKILL cleanup.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 85
diff changeset
1126 except AttributeError:
e02ca4e9bf42 Be more careful on SIGINT/SIGKILL cleanup.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 85
diff changeset
1127 basename = None
e02ca4e9bf42 Be more careful on SIGINT/SIGKILL cleanup.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 85
diff changeset
1128 if basename is not None:
e02ca4e9bf42 Be more careful on SIGINT/SIGKILL cleanup.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 85
diff changeset
1129 logger.debug("cleaning up %s" % (basename,))
e02ca4e9bf42 Be more careful on SIGINT/SIGKILL cleanup.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 85
diff changeset
1130 clean_targets = set([os.path.realpath('.')])
e02ca4e9bf42 Be more careful on SIGINT/SIGKILL cleanup.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 85
diff changeset
1131 if hasattr(self, 'current_directory'):
e02ca4e9bf42 Be more careful on SIGINT/SIGKILL cleanup.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 85
diff changeset
1132 clean_targets.add(os.path.realpath(self.current_directory))
e02ca4e9bf42 Be more careful on SIGINT/SIGKILL cleanup.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 85
diff changeset
1133 for directory in clean_targets:
e02ca4e9bf42 Be more careful on SIGINT/SIGKILL cleanup.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 85
diff changeset
1134 self.clean_destination(os.path.join(directory, basename))
48
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
1135 sys.exit(1)
0a0eeeb5b97d [svn] Handle SIGINT and SIGKILL.
brett
parents: 47
diff changeset
1136
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
1137 def parse_options(self, arguments):
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
1138 parser = optparse.OptionParser(
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
1139 usage="%prog [options] archive [archive2 ...]",
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
1140 description="Intelligent archive extractor",
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
1141 version=VERSION_BANNER
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
1142 )
19
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
1143 parser.add_option('-l', '-t', '--list', '--table', dest='show_list',
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
1144 action='store_true', default=False,
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
1145 help="list contents of archives on standard output")
84
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
1146 parser.add_option('-m', '--metadata', dest='metadata',
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
1147 action='store_true', default=False,
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
1148 help="extract metadata from a .deb/.gem")
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
1149 parser.add_option('-r', '--recursive', dest='recursive',
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
1150 action='store_true', default=False,
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
1151 help="extract archives contained in the ones listed")
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
1152 parser.add_option('--one', '--one-entry', dest='one_entry_default',
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
1153 default=None,
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
1154 help=("specify extraction policy for one-entry " +
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
1155 "archives: inside/rename/here"))
20
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
1156 parser.add_option('-n', '--noninteractive', dest='batch',
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
1157 action='store_true', default=False,
69c93c3e6972 [svn] If the archive contains one directory with the "wrong" name, ask the user
brett
parents: 19
diff changeset
1158 help="don't ask how to handle special cases")
84
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
1159 parser.add_option('-o', '--overwrite', dest='overwrite',
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
1160 action='store_true', default=False,
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
1161 help="overwrite any existing target output")
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
1162 parser.add_option('-f', '--flat', '--no-directory', dest='flat',
29
5fad99c17221 [svn] Add support for Ruby Gems, and extracting metadata from .deb/.gem files.
brett
parents: 28
diff changeset
1163 action='store_true', default=False,
84
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
1164 help="extract everything to the current directory")
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
1165 parser.add_option('-v', '--verbose', dest='verbose',
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
1166 action='count', default=0,
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
1167 help="be verbose/print debugging information")
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
1168 parser.add_option('-q', '--quiet', dest='quiet',
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
1169 action='count', default=3,
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
1170 help="suppress warning/error messages")
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
1171 self.options, filenames = parser.parse_args(arguments)
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
1172 if not filenames:
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
1173 parser.error("you did not list any archives")
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
1174 # This makes WARNING the default.
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
1175 self.options.log_level = (10 * (self.options.quiet -
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
1176 self.options.verbose))
84
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
1177 try:
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
1178 self.options.one_entry_policy = OneEntryPolicy(self.options)
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
1179 except ValueError:
d78d63cb4c4e Add --one-entry option to specify default handling for one-entry archives.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 83
diff changeset
1180 parser.error("invalid value for --one-entry option")
25
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
1181 self.options.recursion_policy = RecursionPolicy(self.options)
6
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
1182 self.archives = {os.path.realpath(os.curdir): filenames}
77043f4e6a9f [svn] The big thing here is recursive extraction. Find archive files in the
brett
parents: 5
diff changeset
1183
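parse_options computes the log level as 10 * (quiet - verbose), with quiet starting at 3 and verbose at 0. Since the standard logging levels are spaced 10 apart, each -v lowers the threshold by one level and each -q raises it by one. A small sketch of that arithmetic follows; the function name compute_log_level is an assumption, while the defaults are the ones shown in the option definitions.

```python
import logging


def compute_log_level(quiet=3, verbose=0):
    """Mirror the arithmetic in parse_options."""
    return 10 * (quiet - verbose)


# optparse's 'count' action increments from the default, so a single
# -q gives quiet=4 and a single -v gives verbose=1.
examples = [
    ("(no flags)", compute_log_level()),
    ("-v", compute_log_level(verbose=1)),
    ("-vv", compute_log_level(verbose=2)),
    ("-q", compute_log_level(quiet=4)),
]
for flags, level in examples:
    print(flags, level, logging.getLevelName(level))
```

This is also why show_extraction returns early when log_level is above logging.INFO: the file listing is only printed once at least one -v has been given.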
12
5d202467c589 [svn] Introduce a real logging system. Right now all this really gets us is the
brett
parents: 11
diff changeset
1184 def setup_logger(self):
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
1185 logging.getLogger().setLevel(self.options.log_level)
12
5d202467c589 [svn] Introduce a real logging system. Right now all this really gets us is the
brett
parents: 11
diff changeset
1186 handler = logging.StreamHandler()
52
cf191f957fd0 [svn] Make just one -v print a list of filenames, a la tar.
brett
parents: 51
diff changeset
1187 handler.setLevel(self.options.log_level)
19
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
1188 formatter = logging.Formatter("dtrx: %(levelname)s: %(message)s")
12
5d202467c589 [svn] Introduce a real logging system. Right now all this really gets us is the
brett
parents: 11
diff changeset
1189 handler.setFormatter(formatter)
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
1190 logger.addHandler(handler)
33
3547e3124729 [svn] Fix some bugs and make things a little more user-friendly now that we can
brett
parents: 32
diff changeset
1191 logger.debug("logger is set up")
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
1192
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
1193 def recurse(self, filename, extractor, action):
53
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
1194 self.options.recursion_policy.prep(filename, action.target, extractor)
25
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
1195 if self.options.recursion_policy.ok_to_recurse():
53
cd853ddb224c [svn] Add interactive option to list recursive archives when found.
brett
parents: 52
diff changeset
1196 for filename in extractor.included_archives:
71
b0290eeb3b7a Recurse better when the contents were just one file.
Brett Smith <brett@brettcsmith.org>
parents: 70
diff changeset
1197 logger.debug("recursing with %s archive" %
b0290eeb3b7a Recurse better when the contents were just one file.
Brett Smith <brett@brettcsmith.org>
parents: 70
diff changeset
1198 (extractor.content_type,))
25
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
1199 tail_path, basename = os.path.split(filename)
71
b0290eeb3b7a Recurse better when the contents were just one file.
Brett Smith <brett@brettcsmith.org>
parents: 70
diff changeset
1200 path_args = [self.current_directory, extractor.included_root,
b0290eeb3b7a Recurse better when the contents were just one file.
Brett Smith <brett@brettcsmith.org>
parents: 70
diff changeset
1201 tail_path]
b0290eeb3b7a Recurse better when the contents were just one file.
Brett Smith <brett@brettcsmith.org>
parents: 70
diff changeset
1202 logger.debug("included root: %s" % (extractor.included_root,))
b0290eeb3b7a Recurse better when the contents were just one file.
Brett Smith <brett@brettcsmith.org>
parents: 70
diff changeset
1203 logger.debug("tail path: %s" % (tail_path,))
b0290eeb3b7a Recurse better when the contents were just one file.
Brett Smith <brett@brettcsmith.org>
parents: 70
diff changeset
1204 if os.path.isdir(action.target):
b0290eeb3b7a Recurse better when the contents were just one file.
Brett Smith <brett@brettcsmith.org>
parents: 70
diff changeset
1205 logger.debug("action target: %s" % (action.target,))
b0290eeb3b7a Recurse better when the contents were just one file.
Brett Smith <brett@brettcsmith.org>
parents: 70
diff changeset
1206 path_args.insert(1, action.target)
b0290eeb3b7a Recurse better when the contents were just one file.
Brett Smith <brett@brettcsmith.org>
parents: 70
diff changeset
1207 directory = os.path.join(*path_args)
25
ef62f2f55eb8 [svn] Move policy-handling code into a dedicated set of classes. This makes
brett
parents: 23
diff changeset
1208 self.archives.setdefault(directory, []).append(basename)
8
97388f5ff770 [svn] Make ExtractorApplication suck less. Now the strategies for handling
brett
parents: 7
diff changeset
1209
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1210 def check_file(self, filename):
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1211 try:
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1212 result = os.stat(filename)
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1213 except OSError, error:
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1214 return error.strerror
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1215 if stat.S_ISDIR(result.st_mode):
79
9c0cc7aef510 Improve dtrx -l performance on misnamed files, and clean other error messages.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 78
diff changeset
1216 return "cannot work with a directory"
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1217
78
978307ec7d11 Don't show errors from failed extractors unless they all fail.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 77
diff changeset
1218 def show_stderr(self, logger_func, stderr):
978307ec7d11 Don't show errors from failed extractors unless they all fail.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 77
diff changeset
1219 if stderr:
79
9c0cc7aef510 Improve dtrx -l performance on misnamed files, and clean other error messages.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 78
diff changeset
1220 logger_func("Error output from this process:\n" +
78
978307ec7d11 Don't show errors from failed extractors unless they all fail.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 77
diff changeset
1221 stderr.rstrip('\n'))
978307ec7d11 Don't show errors from failed extractors unless they all fail.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 77
diff changeset
1222
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1223 def try_extractors(self, filename, builder):
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1224 errors = []
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1225 for extractor in builder:
86
e02ca4e9bf42 Be more careful on SIGINT/SIGKILL cleanup.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 85
diff changeset
1226 self.current_extractor = extractor # For the abort() method.
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1227 error = self.action.run(filename, extractor)
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1228 if error:
78
978307ec7d11 Don't show errors from failed extractors unless they all fail.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 77
diff changeset
1229 errors.append((extractor.file_type, extractor.encoding, error,
978307ec7d11 Don't show errors from failed extractors unless they all fail.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 77
diff changeset
1230 extractor.get_stderr()))
77
3a1f49be7667 Clean the target directory if an extraction attempt failed.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 76
diff changeset
1231 if extractor.target is not None:
3a1f49be7667 Clean the target directory if an extraction attempt failed.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 76
diff changeset
1232 self.clean_destination(extractor.target)
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1233 else:
78
978307ec7d11 Don't show errors from failed extractors unless they all fail.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 77
diff changeset
1234 self.show_stderr(logger.warn, extractor.get_stderr())
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1235 self.recurse(filename, extractor, self.action)
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1236 return
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1237 logger.error("could not handle %s" % (filename,))
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1238 if not errors:
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1239 logger.error("not a known archive type")
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1240 return True
78
978307ec7d11 Don't show errors from failed extractors unless they all fail.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 77
diff changeset
1241 for file_type, encoding, error, stderr in errors:
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1242 message = ["treating as", file_type, "failed:", error]
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1243 if encoding:
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1244 message.insert(1, "%s-encoded" % (encoding,))
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1245 logger.error(' '.join(message))
78
978307ec7d11 Don't show errors from failed extractors unless they all fail.
Brett Smith <brettcsmith@brettcsmith.org>
parents: 77
diff changeset
1246 self.show_stderr(logger.error, stderr)
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1247 return True
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1248
19
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
1249 def run(self):
125
c4495fc7d00d Add support for fetching archives from URLs
Matthew Wild <mwild1@gmail.com>
parents: 123
diff changeset
1250 urlhandler = UrlHandler()
19
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
1251 if self.options.show_list:
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
1252 action = ListAction
19
bb6e9f4af1a5 [svn] Rename the program to dtrx.
brett
parents: 18
diff changeset
1253 else:
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
1254 action = ExtractionAction
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1255 self.action = action(self.options, self.archives.values()[0])
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
1256 while self.archives:
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
1257 self.current_directory, self.filenames = self.archives.popitem()
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
1258 os.chdir(self.current_directory)
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
1259 for filename in self.filenames:
125
c4495fc7d00d Add support for fetching archives from URLs
Matthew Wild <mwild1@gmail.com>
parents: 123
diff changeset
1260 if urlhandler.is_url(filename):
c4495fc7d00d Add support for fetching archives from URLs
Matthew Wild <mwild1@gmail.com>
parents: 123
diff changeset
1261 error, filename = urlhandler.fetch(filename)
31
c3a2760d1c3a [svn] Refactor actions (extract the archive, vs. list the contents) into their
brett
parents: 30
diff changeset
1262 builder = ExtractorBuilder(filename, self.options)
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1263 error = (self.check_file(filename) or
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1264 self.try_extractors(filename, builder.get_extractor()))
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1265 if error:
45
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1266 if error is not True:
37d555407334 [svn] At work I was getting an unhelpful "No such file or directory" error when I
brett
parents: 43
diff changeset
1267 logger.error("%s: %s" % (filename, error))
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1268 self.failures.append(filename)
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
1269 else:
39
027fcd7ae002 [svn] Improve the error reporting to be more user-friendly, at least in many of
brett
parents: 36
diff changeset
1270 self.successes.append(filename)
30
1015bbd6dc5e [svn] If we can't figure out what the file is by mimetype, try using the file
brett
parents: 29
diff changeset
1271 self.options.one_entry_policy.permanent_policy = EXTRACT_WRAP
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
1272 if self.failures:
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
1273 return 1
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
1274 return 0
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
1275
1
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
1276
a86a0cb0dd57 [svn] Repository reorganization to make tags easy
brett
parents:
diff changeset
1277 if __name__ == '__main__':
5
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
1278 app = ExtractorApplication(sys.argv[1:])
36f352abd093 [svn] Deal with a bunch of low-hanging fruit:
brett
parents: 2
diff changeset
1279 sys.exit(app.run())
