
change the methods so they all throw exceptions
change the shell code to catch these exceptions
shell version 1.2
change the quoted-string parser to simplify

Pat Beirne 3 years ago
parent
commit
ddbea490ea
2 changed files with 131 additions and 129 deletions
  1. README.md (+36 -22)
  2. tf.py (+95 -107)

+ 36 - 22
README.md

@@ -1,6 +1,6 @@
 # TF
 
-A module for manipulating files in the *MicroPython* environment.  
+A module for manipulating **T**ext **F**iles in the *MicroPython* environment.  
 
 [TOC]
 
@@ -8,8 +8,8 @@ A module for manipulating files in the *MicroPython* environment.
 
 I discovered *MicroPython* when working on the ESP8266 processor. Everything seemed very nice, except it was awkward moving files around. All the methods I could find required a back-and-forth with the programmer's desktop.
 
-This **TF** module includes functions for creating, searching, editing and making backups of local files, using only the embedded processor. The module itself is small (about 7k) and can be downloaded into the target machine. Once there, the user can invoke it by either calling functions, or using the builtin command line. 
-
+This **TF** module includes functions for creating, searching, editing and making backups of local text files, using only the embedded processor. The module itself is small (about 7k) and can be downloaded into the target machine. Once there, the user can invoke it by either calling functions, or using the builtin command line. 
+```
 For example, to make a backup, you can call  
 
 ```
@@ -19,7 +19,7 @@ For example, to make a backup, you can call
 or you can use the builtin command line and
 
 ```
-/$ cp m.log.bak mail.log
+/$ cp mail.log m.log.bak
 /$ dir
 -rwx all       230 boot.py
 -rwx all      2886 m.log.bak
@@ -49,9 +49,9 @@ disk size:     392 KB   disk free: 196 KB
 /$ 
 ```
 
-The first half of the **TF** module holds the functions. These may come in handy for parsing files, making backups or searching through files. 
+The first half of the **TF** module holds the functions. With these you can  parse or search files, or make backups. 
 
-The second half contains the simple command shell. This may come in handy for testing the functions, experimenting with their functions, or if you, like me, like to play around with a live system. If you don't need the shell, just delete everything from `def _help():` downward.
+The second half contains the simple command shell. This may come in handy for testing the functions, experimenting with how they work, or if you, like me, enjoy playing around with a live system. [If you don't need the shell, just delete everything from `-----cut here` downward.]
 
 ## Functions
 
@@ -69,11 +69,12 @@ tf.cp('log.txt','log.bak')`
    in: src-filename    file to read
        dest-filename   file to write
    returns: Null
+   except: OSError if src or dest file cannot be found/created
 ```
 
-Simply copies a source file to a destination file. Filenames may include folders or . or .. prefixes. The destination is overwritten if it exists. This function reads-&-writes one line at a time, so it can handle megabyte files. Typical speeds are 100kB/sec on an ESP8266. 
+Simply copies a source file to a destination file. Filenames may include folders or . or .. prefixes; use `/` to separate folder+filename. The destination is overwritten if it exists. This function reads-&-writes one line at a time, so it can handle megabyte files. Typical speeds are 100kB/sec on an ESP8266. 
 
-**NOTE** this function *only works on text files*. Line lengths of up to 4096 work fine on the ESP8266.
+**NOTE** this function *only works on text files* delimited by `\n`. Line lengths of up to 4096 work fine on the ESP8266.
 
 #### cat()
 
@@ -85,6 +86,7 @@ Simply copies a source file to a destination file. Filenames may include folders
         numbers     whether to prepend each line with line-number + space
         title       whether to prepend the listing with the filename
     return: Null
+    except: OSError if file cannot be found
 ```
 
 Displays the source file on the screen.  You can specify a line range, and whether line numbers are displayed, and whether to put a *title line* on the output display.  
@@ -95,6 +97,7 @@ Displays the source file on the screen.  You can specify a line range, and wheth
     dir(directory-name='')
     in:     directory-name     defaults to current directory
     return: Null
+    except: OSError if directory doesn't exist
 ```
 
 Displays the contents of the current working directory. Files and folders are marked; ownership is assumed to be `all` and all are assumed to be `rwx` (read+write+execute). The file size is also shown and the disk size summary is shown at the bottom. 
@@ -111,17 +114,19 @@ NOTE: the name is `_dir()` because `dir()` is a python builtin.
          pattern          a python regex to match
          numbers          whether to prepend a line-number + space 
     return: Null
+    except: ValueError if the pattern fails to compile as reg-ex
+        OSError if the file cannot be found
+        RunTimeError if the reg-ex parsing uses up all the memory
 ```
 
-You can search a file for a pattern, and any matching lines are displayed. 
-
-Searches using ^ (start of line) work fine, but searches with $ (end-of-line) aren't currently working.
+You can search a file for a pattern, and any matching lines are displayed. Searches are restricted to within a line, don't bother with `\r` and `\n` searches. 
 
+ 
 ###### Examples
 
 ```
 tf.grep('log.txt', '2021-03-\d\d')
-tf.grep('config.txt', 'user.\s=')
+tf.grep('config.txt', '^user\s*=')
 tf.grep('config.ini', '\[\w*\]', numbers = True)
 ```
 
@@ -131,11 +136,14 @@ tf.grep('config.ini', '\[\w*\]', numbers = True)
   sed(filename, pattern, bak_ext=".bak")
   in:  filename       the file to edit
        pattern        a sed pattern, involving one of "aidsxX"
-       bak_ext        the extension to use when creating the file backup (without the dot)
+       bak_ext        the extension to use when creating the file backup (with the dot)
   return: tuple(number of lines in the input file, number of lines modified/added/deleted/matched)
+  except: OSError     the file cannot be found, or the backup cannot be created
+     ValueError       the reg-ex pattern fails to compile
+     RunTimeError     the reg-ex parsing uses up all the memory
 ```
 
-The *sed* function is an inline file editor, based on `sed` from the Unix world. When invoked, it first renames the source file to have a `.bak` extension. That file is opened and each line of the source file is loaded in, and a regex pattern match is performed. If the line is changed/found/inserted, then the output is streamed to the new (output) file with the same name as the original; it appears to the user that the files is edited-in-place, with a .bak file created. 
+The *sed()* function is an inline file editor, based on `sed` from the Unix world. When invoked, it first renames the source file to have a `.bak` extension. That file is opened and each line of the source file is loaded in, and a regex pattern match is performed. If the line is changed/found/inserted, then the output is streamed to the new (output) file with the same name as the original; it appears to the user that the files is edited-in-place, with a .bak file created. 
 
 This version of `sed` has 6 commands:
 
@@ -146,7 +154,7 @@ This version of `sed` has 6 commands:
 * x does a grep and only saves lines that match
 * X does a grep and only saves lines that do not match
 
-If the single-letter command is preceded by a number or number-range, then the edit operation only applies to that line(s). A number range may be separated by `-` hyphen or `,` comma.
+If the single-letter command is preceded by a number or number-range, then the edit operation only applies to that line(s). A number range may be separated by `-` hyphen or `,` comma. Use `$` to indicate end-of-file.
 
 ##### Examples
 
@@ -159,7 +167,9 @@ If the single-letter command is preceded by a number or number-range, then the e
                                  with two-# and two-spaces....to align some comments
 ```
 
-The x/X patterns are wrapped in a pair of delimiter characters, typically /, although any other character is allowed (except space). Valid X commands are:
+The `i/a/d` commands should be preceeded by a line number, or range; `sed()` will *insert*, *append* or *delete* once for each line in the range. 
+ 
+The ``x/X` patterns are wrapped in a pair of delimiter characters, typically /, although any other character is allowed (except space or any of `\^$()[]`). Valid X commands are:
 
 ```
 x/abcd/
@@ -167,12 +177,12 @@ x/abcd/
 x!ratio x/y!
 ```
 
-Similarly, the s patterns are wrapped in a triplet of delimiter characters, typcially / also. Valid 's' commands are
+Similarly, the s patterns are wrapped in a triplet of delimiter characters, typcially / also. If the search pattern has `()` groups, the replace pattern can refer to them with ``\1 \2`,etc. Valid 's' commands are
 
 ```
 s/toronto/Toronto/
 s/thier/their/
-10-120s/while\s(True|False)/while 1/
+120-$s/while\s*(True|False)/while 1/
 s@ratio\s*=\s*num/denom@ratio = num/denom if denom else 0@
 ```
 
@@ -180,6 +190,8 @@ s@ratio\s*=\s*num/denom@ratio = num/denom if denom else 0@
 
 **Note**: You will need some free space on your disk, the same size as the source file, as a backup file is *always* made. To edit an 800k file, you should have 800k of free space.
 
+**Note**: On error, the functions above throw exceptions. The simple shell below catches the exceptions. If you use the functions above, wrap them up in `try/except`.
+
 **Note**: The functions for
 
 * file delete (`rm, del`)
@@ -209,7 +221,8 @@ rmdir <dirname>
 help
 ```
 
-You can also use `copy`, `move`, `del`, `list` and `ls` as synonyms for `cp`, `mv`, `rm`, `cat` and `dir` .  The `mv` can rename directories. 
+You can also use `copy`, `move`, `del`, `list` and `ls` as synonyms for
+ `cp`, `mv`, `rm`, `cat` and `dir` .  The `mv` can rename directories. 
 
 For the `cat/list` command, you can enable line numbers with `-n` and you can limit the display range with `-l n-m` where `n` and `m` are decimal numbers (and n should be less than m). These are all valid uses of `cat`
 
@@ -220,7 +233,7 @@ cat -l 223-239 log.txt     # 17 lines
 cat -l244-$ log.txt        # from 244 to the end
 ```
 
-For `grep`  and `sed`, the patterns are *MicroPython* regular explressions, from the `re` module. If a pattern has a space character in it, then the pattern must be wrapped in  single-quote ' characters; patterns without an embedded space char can simply be typed. [The line parser is basically a `str.split()` unless a leading ' is detected.] To include a single quote in a quoted-pattern, you can escape it with \ .
+For `grep`  and `sed`, the patterns are *MicroPython* regular explressions, from the `re` module. If a pattern has a space character in it, then the pattern **must** be wrapped in  single-quote ' characters; patterns without an embedded space char can simply be typed. [The line parser is basically a `str.split()` unless a leading ' is detected.] To include a single quote in a quoted-pattern, you can escape it with ``\'` .
 
 Here are some valid uses of `sed` and `grep`
 
@@ -251,9 +264,10 @@ In its present form, the module has these limitations:
   * the target of `cp`and `mv` *cannot* be a simple a directory-name as in Linux; write the whole filename *w.r.t,* the current directory
 * the complexity of pattern matching is limited. 
   * try to format the grep patterns so they avoid deep stack recursion. For example, '([^#]|\\#)\*' has a very generous search term as the first half, and can cause deep-stack recursion. The equivalent '(\\#|[^#]\*)' is more likely to succeed.
-* with sed, lines are parsed and saved one-line-at-a-time, so pattern matching to \n and \r does not work
+* with sed, lines are parsed and saved one-line-at-a-time, so pattern matching to \n and \r does not work; sed cannot work over line boundaries
 * this simple shell is different than [mpfshell](https://github.com/wendlers/mpfshell) in that this shell runs entirely on the target device. There is no allowance in this shell for transferring files in/out of the target.
-
+* after a restart of your *MicroPython* board, you can invoke the shell with `import tf`; if you `^C` out of the shell, the second invocation of `tf` will have to be `import tf` followed by `tf.main()`, since the python interpreter caches the module and only loads it once per restart; you can intentionally restart the REPL prompt by hitting `^D` 
+ 
 ## Examples
 
 Make a simple change to a source file, perhaps modify a constant.

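The README note added in this commit says callers should wrap the functions in `try/except` now that they raise instead of printing. A minimal sketch of that calling pattern follows. It assumes a stand-in `cp()` with the same open order as the patched `tf.cp()` (destination first, then source), and `safe_cp()` is a hypothetical helper that is not part of `tf.py`.

```python
def cp(src, dest):
    # Stand-in for the patched tf.cp(): no try/except inside, so a missing
    # source or an unwritable destination raises OSError to the caller.
    with open(dest, 'w') as g:
        with open(src) as f:
            for lin in f:
                g.write(lin)


def safe_cp(src, dest):
    # Hypothetical wrapper (not in tf.py): report the failure, return a flag.
    try:
        cp(src, dest)
        return True
    except OSError as e:
        print("copy failed:", e)
        return False
```

Because the destination is opened before the source, a failed copy of a missing source can still leave an empty destination file behind; a caller that cares may want to remove it in the `except` branch.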
+ 95 - 107
tf.py

@@ -16,77 +16,61 @@ import re,os,sys,gc
 def transfer(src,dest,first=1,last=0xFFFFFFFF,numbers=False,grep_func=None):
   #src is a filename, dst is a handle
   i=0
-  try:
-    with open(src) as f:
-      for lin in f:
-        i=i+1
-        if i<first or i>last:
-          continue
-        if grep_func and not grep_func(lin):
-          continue
-        if numbers:
-          dest.write(str(i)+' ')
-        dest.write(lin)
-  except:
-    print("could not open file {}".format(src))
+  with open(src) as f:
+    for lin in f:
+      i=i+1
+      if i<first or i>last:
+        continue
+      if grep_func and not grep_func(lin):
+        continue
+      if numbers:
+        dest.write(str(i)+' ')
+      dest.write(lin)
 
 def cp(src,dest):
-  try:
-    with open(dest,'w') as g:
-      transfer(src,g)
-  except:
-    print("could not write to file {}".format(dest))
-
-def grep(filename, pattern, numbers=False):
-  m=re.compile(pattern)
-  if not m:
-    print("grep() called with invalid pattern")
-    return 
-  transfer(filename,sys.stdout,numbers=numbers,grep_func=(lambda x:m.search(x)))
+  with open(dest,'w') as g:
+    transfer(src,g)
 
 def cat(filename, first=1, last=1000000, numbers=False, title=True):
   if title:
-    print("===={}=====".format(filename))
+    print("===={}====".format(filename))
   transfer(filename,sys.stdout,first,last,numbers=numbers)
 
+def grep(filename, pattern, numbers=False):
+  m=re.compile(pattern)
+  transfer(filename,sys.stdout,numbers=numbers,grep_func=(lambda x:m.search(x[:-1])))
+
 def sed(filename, sed_cmd, bak_ext=".bak"):
   # parse the sed_cmd
   # group 1,3 are the n-start, n-end    group 4 is command: aidsxX
   a=re.search("^(\d*)([,-](\d+|\$))?\s*([sdaixX].*)",sed_cmd)
   if not a:
-    print("sed() failed; 2nd argument must be a number-range followed by one of sdaixX; no changes applied")
-    return
+    raise ValueError("sed() failed; pattern must be a number-range followed by one of sdaixX; no changes applied")
   cmd=a.group(4)
 
-  s,e=(1,1000000)
+  s,e=1,1000000
   if a.group(1):
     s=e=int(a.group(1))
   if a.group(3):
     e=1000000 if a.group(3)=='$' else int(a.group(3))
 
   op=cmd[0]
-  if op not in "sdiaxX":
-    print("sed requires an operation, one of 's,d,i,a,x or X'")
-    return
-  #print("sed command parser of <{}> returned {} {} {} {}".format(cmd,sr,de,ins,add))
+  if op in "aid" and e-s==1000000:
+    raise ValueError("sed(a/i/d) should have a line number")
+  #print("sed command parser of <{}> returned {} {} {}".format(op,cmd,a.group(1),a.group(3)))
   if op in "sxX":
-    if len(cmd)<2: 
-      print("invalid sed argument")
-      return
+    if len(cmd)<2 or cmd[1] in "\^$()[]": 
+      raise ValueError("invalid sed argument: {}".format(cmd))
     dl=cmd[1]
     if op=='s':
       gs=re.search("s"+dl+"([^"+dl+"]*)"+dl+"([^"+dl+"]*)"+dl,cmd)
     else:
       gs=re.search("[xX]"+dl+"([^"+dl+"]*)"+dl,cmd)
     if not gs:
-      print("invalid sed search pattern")
-      return 0,0
+      raise ValueError("invalid sed search pattern: {}".format(cmd))
+    ss=gs.group(1)
     if op=='s':
-      ss,r = gs.group(1),gs.group(2)
-      #print("search <{}> and replace <{}>".format(s,r))  
-    else:
-      ss=gs.group(1) 
-      #print("search <{}>".format(s))  
+      r=gs.group(2)
     sp=re.compile(ss) 
 
   extra=a.group(4)[1:] + '\n' 
@@ -94,50 +78,41 @@ def sed(filename, sed_cmd, bak_ext=".bak"):
   try:
     os.rename(filename,filename+bak_ext)
   except:
-    print("problem with filename; backup failed; no changes made")
-    return
+    raise OSError("problem with filename",filename,"   backup failed; no changes made") 
 
   i=h=0
-  try: 
-    with open(filename+bak_ext) as f:
-      with open(filename,'w') as g:
-        for lin in f:
-          i=i+1
-          m=(i>=s and i<=e)
-          if op=='s' and m:
-            lin=lin[:-1]
-            if sp.search(lin): h+=1
-            lin=sp.sub(r,lin)+'\n'
-          if op=='d' and m:
-            h+=1
-            continue   # delete line
-          if op=='i' and m:
-            #print("insert a line before {} <{}>".format(i,extra))
-            g.write(extra)
-            h+=1
-          if op in "aids":
-            g.write(lin)
-          elif m and (op=='x' if sp.search(lin) else op=='X'):
-            g.write(lin)
-            h+=1
-          if op=='a' and m:
-            #print("append a line after {} <{}>".format(i,extra))       
-            g.write(extra)
-            h+=1
-        #f.write("--file modifed by sed()--\n")
-  except OSError:
-    print("problem opening file {}".format(filename))
-  except RuntimeError:
-    print("problem with the regex; try a different pattern")
+  with open(filename+bak_ext) as f:
+    with open(filename,'w') as g:
+      for lin in f:
+        i=i+1
+        m=(i>=s and i<=e)
+        if op=='s' and m:
+          lin=lin[:-1]
+          if sp.search(lin): h+=1
+          lin=sp.sub(r,lin)+'\n'
+        if op=='d' and m:
+          h+=1
+          continue   # delete line
+        if op=='i' and m:
+          #print("insert a line before {} <{}>".format(i,extra))
+          g.write(extra)
+          h+=1
+        if op in "aids":
+          g.write(lin)
+        elif m and (op=='x' if sp.search(lin) else op=='X'):
+          g.write(lin)
+          h+=1
+        if op=='a' and m:
+          #print("append a line after {} <{}>".format(i,extra))       
+          g.write(extra)
+          h+=1
+      #f.write("--file modifed by sed()--\n")
   return (i, h)
 
 def _dir(d='.'):
-  try:  
-    for f in os.listdir(d):
-      s=os.stat(d+'/'+f)
-      print("{}rwx all {:9d} {}".format('d' if (s[0] & 0x4000) else '-',s[6],f))
-  except:
-    print("not a valid directory")
+  for f in os.listdir(d):
+    s=os.stat(d+'/'+f)
+    print("{}rwx all {:9d} {}".format('d' if (s[0] & 0x4000) else '-',s[6],f))
   s=os.statvfs('/')
   print("disk size:{:8d} KB   disk free: {} KB".format(s[0]*s[2]//1024,s[0]*s[3]//1024))
 
@@ -149,26 +124,27 @@ if 'tf_extend.py' in os.listdir():
   ext_cmd=tf_extend.cmd
 
 def _help():
-  print("==Simple shell v1.1")
+  print("==Simple shell v1.2 for Text Files")
   print("  cp/copy <src-file> <dest-file>")
   print("  mv/move <src-file> <dest-file>    \t\trm/del <file>")
-  print("  cd [<folder>]       mkdir <folder>\t\trmdir <folder>")
+  print("  cd [<folder>]\t\tmkdir <folder>\t\trmdir <folder>")
   print("  dir/ls [<folder>]")
   print("  cat/list [-n] [-l <n>,<m>] <file>")
   print("  grep <pattern> <file>")
   print("  sed <pattern> <file>")
-  print("      pattern is '<line-range><op><extra>'   e.g'a/search/replace/', 'x!TODO:!', '43,49d', '8itext'")
-  print("      patterns with spaces require single-quotes   sed ops are one of s/d/i/a/x/X")
-  print("      sed does not work across line boundaries     sed s/x/X-patterns: non-/ delimiters are allowed")
-  print("file names must NOT have embedded spaces           options must be early on the command line")
+  print("      pattern is <line-range><op><extra>   e.g'a/search/replace/', 'x!TODO:!', '43,49d', '8itext'")
+  print("      patterns with spaces require '-quotes\tsed ops are one of s/d/i/a/x/X")
+  print("      sed cannot cross line boundaries\t\tsed s/x/X-patterns: non-/ delimiters are ok")
+  print("file names must NOT have embedded spaces\toptions must be early on the command line")
   ext_cmd('help')
 
 def parseQuotedArgs(st):
+  st=st.strip()  
   if st[0]=="'":
-    p=re.search("'((\'|[^'])*)'",st)
+    p=re.search("'(.*?[^\\\\])'",st)
     if not p:
-      print("quoted pattern error")
-      return ""
+      print("error in quoted pattern:",st)
+      return 
     return p.group(1)
   else:
     return st.split()[0]
@@ -182,7 +158,10 @@ def main():
     if not len(rp): continue
     op=rp[0]
     if op in ('dir','ls'):
-      _dir(rp[1] if len(rp)>1 else '.')
+      try:
+        _dir(rp[1] if len(rp)>1 else '.')
+      except:
+        print("directory not found")
     elif op in ('cat','list'):
       n=(" -n " in r) #print line-nums
       s,e=1,1000000 #start/end
@@ -191,21 +170,28 @@ def main():
         s=e=int(g.group(2))
         if g.group(3):
           e=int(g.group(4)) if g.group(4) and g.group(4).isdigit() else 1000000
-      cat(rp[-1],s,e,numbers=n)
-    elif op=='grep':
+      try:
+        cat(rp[-1],s,e,numbers=n)
+      except:
+        print("file not found",rp[-1])
+    elif op in('grep','sed'):
       if len(rp)<3:
-        print("grep pattern filename") 
+        print(op,"pattern filename") 
         continue
-      grep(rp[-1],parseQuotedArgs(r[5:]),numbers=True)
-    elif op=='sed':
-      if len(rp)<3:
-        print("sed pattern filename")
+      p=parseQuotedArgs(r[4:])
+      if not p:
         continue
-      r=sed(rp[-1],parseQuotedArgs(r[4:]))
-      if r:
-        print("Lines processed: {}  Lines modifed: {}".format(*r))
-    elif op=='cd':
-      os.chdir(rp[1] if len(rp)>1 else '/')
+      try:
+        if op=='grep':
+          grep(rp[-1],p,numbers=True)
+        else:
+          r=sed(rp[-1],p)
+          if r:
+            print("Lines processed: {}  Lines modifed: {}".format(*r))
+      except (ValueError, OSError) as e:
+        print(e)
+      except RuntimeError:
+        print("problem with the regex; try a different pattern")
     elif op=='help':
       _help()
       ext_cmd(rp)
@@ -215,6 +201,8 @@ def main():
       try:
         if op in ('cp','copy'):
           cp(rp[1],rp[2])
+        elif op=='cd':
+          os.chdir(rp[1] if len(rp)>1 else '/')
         elif op=='mkdir':
           os.mkdir(rp[1])
         elif op=='rmdir':
@@ -224,15 +212,15 @@ def main():
         elif op in('rm','del'):
           os.remove(rp[1])
         else:
-          print("command not implemented")
+          print("command not implemented:",op)
       except IndexError:
         print("not enough argments; check syntax with 'help'")
       except OSError:
-        print("file not found")
+        print("file/folder not found or cannot be writtencd")
     gc.collect()
   
 if __name__=="tf":
   print("tf module loaded; members cp(), cat(), cd(), _dir(), grep() and sed()")
   main()
 
-
+# grep 12.*\) dmesg fails
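
The simplified quoted-string parser in this commit can be tried on its own. The sketch below copies the new regex from `parseQuotedArgs()` into a standalone function so that its handling of spaces and `\'` escapes is easy to check. The name `parse_quoted_args` and the sample inputs are illustrative assumptions, and the behaviour shown is for CPython's `re`; MicroPython's `re` may differ in corner cases.

```python
import re


def parse_quoted_args(st):
    # Same logic as the patched parseQuotedArgs(), minus the error print.
    st = st.strip()
    if st[0] == "'":
        # Lazy match up to the first closing quote not preceded by a backslash.
        p = re.search(r"'(.*?[^\\])'", st)
        if not p:
            return None
        return p.group(1)
    # Unquoted: the pattern is simply the first whitespace-separated token.
    return st.split()[0]


print(parse_quoted_args(" 's/a b/c d/' notes.txt"))   # s/a b/c d/
print(parse_quoted_args("x!TODO:! main.py"))          # x!TODO:!
print(parse_quoted_args(r"'x/it\'s/' log.txt"))       # x/it\'s/
print(parse_quoted_args("'unterminated"))             # None
```

Note that the backslash in an escaped quote is kept in the returned pattern, so what reaches the regex compiler is `\'`, which matches a literal quote.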