Improve dtrx -l performance on misnamed files, and clean other error messages.

Sun, 20 Jul 2008 21:16:08 -0400

author: Brett Smith <brettcsmith@brettcsmith.org>
date: Sun, 20 Jul 2008 21:16:08 -0400
branch: trunk
changeset: 79:9c0cc7aef510
parent: 78:978307ec7d11
child: 80:df9b3428e28f

dtrx -l would only ever try one extractor, instead of trying all possible
alternatives as it does when extracting. This was mostly because
get_filenames() reported no meaningful error information, so a failed
listing attempt could not trigger a fallback to the next extractor. This
has been fixed.
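
The fallback described above can be sketched roughly as follows. This is a
minimal Python 3 illustration, not the code in scripts/dtrx: the
FakeTarExtractor and FakeZipExtractor classes and the list_contents() helper
are hypothetical stand-ins, and only the ExtractorError name and the
get_filenames() method name are borrowed from the diff. The point it shows is
that get_filenames() is a generator, so it must be consumed inside the try
block for a listing failure to reach the fallback logic.

```python
class ExtractorError(Exception):
    """Raised when an extractor cannot handle a file (name taken from the diff)."""


class FakeTarExtractor:
    """Hypothetical stand-in for an extractor chosen by file extension."""

    def __init__(self, filename):
        self.filename = filename

    def get_filenames(self):
        # Simulate a listing command that fails on a misnamed file.
        raise ExtractorError("not a tar archive")
        yield  # unreachable; keeps this method a generator


class FakeZipExtractor:
    """Hypothetical stand-in for an extractor chosen by content."""

    def __init__(self, filename):
        self.filename = filename

    def get_filenames(self):
        yield "1/2/3"


def list_contents(filename, extractor_classes):
    """Try each candidate in order; return (filenames, errors seen so far)."""
    errors = []
    for cls in extractor_classes:
        try:
            # Consume the generator here so that failures surface inside
            # the try block instead of later, during output.
            return list(cls(filename).get_filenames()), errors
        except ExtractorError as error:
            errors.append((cls.__name__, str(error)))
    raise ExtractorError("no extractor could list %s" % filename)


names, errors = list_contents("trickery.tar.gz",
                              [FakeTarExtractor, FakeZipExtractor])
print(names)
print(errors)
```

Buffering the whole listing with list() is an assumption made here to keep the
sketch short; the real script may stream lines instead.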

Some error messages have been genericized, since we're not always
extracting archives: we might be listing their contents instead.
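
The change to scripts/dtrx in this changeset also makes the compressed-file
extractor verify, through ExtractorBuilder.try_by_magic() and the file
command, that its input really looks compressed before reporting a filename.
The sketch below illustrates the same idea with a different technique: it
reads well-known magic bytes directly instead of calling file. The names
sniff_format(), looks_compressed(), and MAGIC_PREFIXES are hypothetical and do
not appear in dtrx.

```python
import gzip
import os
import tempfile

# Leading bytes of a few common formats (a deliberately tiny table).
MAGIC_PREFIXES = {
    b"\x1f\x8b": "gzip",
    b"BZh": "bzip2",
    b"PK\x03\x04": "zip",
}


def sniff_format(path):
    """Return a format name based on the file's leading bytes, or None."""
    with open(path, "rb") as handle:
        header = handle.read(8)
    for prefix, name in MAGIC_PREFIXES.items():
        if header.startswith(prefix):
            return name
    return None


def looks_compressed(path):
    """True only for single-stream compression formats, not archives."""
    return sniff_format(path) in ("gzip", "bzip2")


# Demonstration: a genuine gzip file versus zip data with a .tar.gz name.
workdir = tempfile.mkdtemp()
real_path = os.path.join(workdir, "real.gz")
fake_path = os.path.join(workdir, "trickery.tar.gz")
with open(real_path, "wb") as handle:
    handle.write(gzip.compress(b"hello"))
with open(fake_path, "wb") as handle:
    handle.write(b"PK\x03\x04" + b"\x00" * 16)

real_result = looks_compressed(real_path)
fake_result = looks_compressed(fake_path)
fake_format = sniff_format(fake_path)
print(real_result, fake_result, fake_format)
```

A check like this lets the caller reject the extension-based guess early and
move on to the next candidate extractor.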

TODO
scripts/dtrx
tests/tests.yml
--- a/TODO	Sun Jul 20 20:45:54 2008 -0400
+++ b/TODO	Sun Jul 20 21:16:08 2008 -0400
@@ -2,17 +2,8 @@
 
 To do:
 
-* Add a test for the case where you x -l a misnamed file.
-
-* If any of the extraction processes succeeds, only show stderr for that
-  one.
-
 * Investigate the right way to handle SIGPIPE and do so.
 
-* When we extract a compressed file (or just one file?), check to see if it
-  itself is an archive.  Follow all the usual rules for recursive
-  extraction when we do this.
-
 * --expert mode: prompts don't show an explanation of what the options are,
   unless you ask with ?.
 
--- a/scripts/dtrx	Sun Jul 20 20:45:54 2008 -0400
+++ b/scripts/dtrx	Sun Jul 20 21:16:08 2008 -0400
@@ -278,6 +278,7 @@
 
     def get_filenames(self):
         self.run_pipes()
+        self.check_success(False)
         self.archive.seek(0, 0)
         while True:
             line = self.archive.readline()
@@ -299,6 +300,15 @@
         return '.'.join(pieces)
 
     def get_filenames(self):
+        # This code used to just immediately yield the basename, under the
+        # assumption that that would be the filename.  However, if that
+        # happens, dtrx -l will report this as a valid result for files with
+        # compression extensions, even if those files shouldn't actually be
+        # handled this way.  So, we call out to the file command to do a quick
+        # check and make sure this actually looks like a compressed file.
+        if 'compress' not in [match[0] for match in
+                              ExtractorBuilder.try_by_magic(self.filename)]:
+            raise ExtractorError("doesn't look like a compressed file")
         yield self.basename()
 
     def extract(self):
@@ -454,10 +464,8 @@
 
     def get_filenames(self):
         self.pipe(['7z', 'l', self.filename], "listing")
-        self.run_pipes()
-        self.archive.seek(0, 0)
         fn_index = None
-        for line in self.archive:
+        for line in NoPipeExtractor.get_filenames(self):
             if self.border_re.match(line):
                 if fn_index is not None:
                     break
@@ -478,13 +486,12 @@
 
     def get_filenames(self):
         self.pipe(['cabextract', '-l', self.filename], "listing")
-        self.run_pipes()
-        self.archive.seek(0, 0)
         fn_index = None
-        for line in self.archive:
+        filenames = NoPipeExtractor.get_filenames(self)
+        for line in filenames:
             if self.border_re.match(line):
                 break
-        for line in self.archive:
+        for line in filenames:
             try:
                 yield line.split(' | ', 2)[2].rstrip('\n')
             except IndexError:
@@ -503,9 +510,7 @@
 
     def get_filenames(self):
         self.pipe(['unshield', 'l', self.filename], "listing")
-        self.run_pipes()
-        self.archive.seek(0, 0)
-        for line in self.archive:
+        for line in NoPipeExtractor.get_filenames(self):
             if self.end_re.match(line):
                 break
             else:
@@ -1082,11 +1087,11 @@
         except OSError, error:
             return error.strerror
         if stat.S_ISDIR(result.st_mode):
-            return "cannot extract a directory"
+            return "cannot work with a directory"
 
     def show_stderr(self, logger_func, stderr):
         if stderr:
-            logger_func("Error output from the extraction process:\n" +
+            logger_func("Error output from this process:\n" +
                         stderr.rstrip('\n'))
 
     def try_extractors(self, filename, builder):
--- a/tests/tests.yml	Sun Jul 20 20:45:54 2008 -0400
+++ b/tests/tests.yml	Sun Jul 20 21:16:08 2008 -0400
@@ -477,6 +477,14 @@
     cd trickery
     unzip -q ../$1
 
+- name: listing file with misleading extension
+  options: -l
+  filenames: trickery.tar.gz
+  prerun: cp ${1}test-1.23.zip ${1}trickery.tar.gz
+  cleanup: rm -f ${1}trickery.tar.gz
+  grep: "^1/2/3$"
+  antigrep: "^dtrx:"
+
 - name: non-archive error
   filenames: /dev/null
   error: true
@@ -496,7 +504,7 @@
   filenames: test-directory
   prerun: mkdir test-directory
   error: true
-  grep: "cannot extract a directory"
+  grep: "cannot work with a directory"
 
 - name: permission denied error
   filenames: unreadable-file.tar.gz
