1.3.3.9. File utilities
See also the API documentation: file
1.3.3.9.1. Introduction
The file module provides various file system related features such as:
 - filesystem traversal with depth support
 - file search, wildcard or regex based
 - file rollover (backup)
 - size parsing and formatting
 - directory creation without error on existing directory
 
1.3.3.9.3. File search
The find/xfind/efind/xefind functions provide flexible lookup of files and directories.
xfind and xefind are generator functions: calling them does not build a potentially huge file/directory list, so you can process the results in a loop or use the optional callbacks described below.
find and efind simply call xfind and xefind and return the lookup result as a list.
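The relationship between the generator and list variants can be illustrated with a minimal standard-library sketch. The names `xfind_sketch` and `find_sketch` are hypothetical and are not the vacumm implementation; they only show why a generator avoids building the whole result list up front.

```python
import fnmatch
import os
import shutil
import tempfile


def xfind_sketch(pattern, path):
    """Yield matching file paths one at a time (hypothetical helper)."""
    for root, dirs, files in os.walk(path):
        for name in sorted(files):
            if fnmatch.fnmatch(name, pattern):
                yield os.path.join(root, name)


def find_sketch(pattern, path):
    """Return the same results as a list by consuming the generator."""
    return list(xfind_sketch(pattern, path))


# Build a tiny throwaway directory to search.
tmpdir = tempfile.mkdtemp()
try:
    for name in ('setup.py', 'setup.cfg', 'README'):
        open(os.path.join(tmpdir, name), 'w').close()
    lazy = xfind_sketch('setup*', tmpdir)       # no directory is read yet
    first = next(lazy)                          # the walk starts now and pauses at the first match
    everything = find_sketch('setup*', tmpdir)  # whole result list built at once
finally:
    shutil.rmtree(tmpdir)
```

With a generator, a caller that stops after the first few matches never pays for the rest of the traversal.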
The simplest way to search is to use a wildcard pattern (?*[seq][!seq])
with find/xfind. See fnmatch for details on wildcard patterns, how they work and
how they are supported.
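As a quick reminder of the four wildcard constructs, here is a small sketch that uses only the standard fnmatch module. The file names are made up for illustration.

```python
import fnmatch

names = ['setup.py', 'setup.cfg', 'misc.file.rst', 'data1.nc', 'dataA.nc']

# '*' matches any run of characters, '?' matches exactly one character.
star = fnmatch.filter(names, 'setup*')          # ['setup.py', 'setup.cfg']
question = fnmatch.filter(names, 'setup.??')    # ['setup.py']

# '[seq]' matches one character in seq, '[!seq]' one character not in seq.
in_seq = fnmatch.filter(names, 'data[0-9].nc')       # ['data1.nc']
not_in_seq = fnmatch.filter(names, 'data[!0-9].nc')  # ['dataA.nc']

print(star, question, in_seq, not_in_seq)
```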
When a more precise lookup is needed, regular expressions are a
very powerful tool, although they take a bit of understanding. See
re and http://www.expreg.com/.
Another advantage of the regex method is that you can extract parts of the file paths
you selected by using the getmatch parameter with efind/xefind.
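The kind of information a match object carries can be shown with the standard re module alone. This sketch applies the expression with `search` to a few hard-coded paths. How efind/xefind apply the expression internally (file name or whole path, `match` or `search`) is not assumed here, and the group name `topic` is made up for the example.

```python
import re

paths = [
    './scripts/tutorials/misc.file.find.py',
    './scripts/tutorials/misc.file.rollover.py',
    './doc/sphinx/source/tutorials/misc.file.rst',
    './setup.py',
]

# One named capturing group for the part after "misc.file.", and one
# non-capturing group listing the accepted extensions.
regex = re.compile(r'misc\.file\.(?P<topic>.*)\.(?:py|rst)$')

topics = []
for path in paths:
    match = regex.search(path)
    if match:
        # groups() holds captured text by position, groupdict() by name.
        print(path, match.groups(), match.groupdict())
        topics.append(match.group('topic'))
```

Only the two `.py` scripts match: `misc.file.rst` has nothing between `misc.file.` and the extension, so the expression, which requires a further dot, rejects it.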
- find and efind have nearly the same features, but a few differences exist:
 - find/xfind accept a single pattern/exclude or a list of them; efind/xefind do not, which should not be a limitation given the capabilities of regular expressions.
 - efind/xefind support returning (matching path, match object) pairs; find/xfind do not.
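The first difference can be made concrete with a small standard-library sketch: a list of wildcard patterns combined with an exclude list selects the same names as a single regular expression using alternation. The helpers `matches_any` and `select` are hypothetical and only mimic the idea of pattern and exclude lists.

```python
import fnmatch
import re


def matches_any(name, patterns):
    """True if name matches at least one wildcard pattern."""
    return any(fnmatch.fnmatch(name, p) for p in patterns)


def select(names, patterns, exclude=()):
    """Keep names matching a pattern and none of the exclude patterns."""
    return [n for n in names
            if matches_any(n, patterns) and not matches_any(n, exclude)]


names = ['misc.config.py', 'misc.config.ini', 'misc.config.cfg',
         'misc.config.rst', 'misc.file.find.py']

# Wildcard approach: several patterns plus an exclusion.
patterns = ('misc.config*.py', 'misc.config*.ini', 'misc.config*.cfg')
by_wildcard = select(names, patterns, exclude=('*.ini',))

# Regex approach: one expression expresses the same selection.
regex = re.compile(r'misc\.config.*\.(?:py|cfg)$')
by_regex = [n for n in names if regex.match(n)]

print(by_wildcard)  # ['misc.config.py', 'misc.config.cfg']
print(by_regex)     # ['misc.config.py', 'misc.config.cfg']
```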
 
Using wildcard expressions: See xfind() and find()
Using regular expressions: See xefind() and efind()
1.3.3.9.3.1. The example
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import os
import vacumm.misc.file as F
print 'Changing the working directory to vacumm\'s root directory'
print
PWD = os.getcwd()
os.chdir(os.path.dirname(os.path.dirname(os.path.dirname(os.path.realpath(__file__)))))
searchdir = '.'
fkw = dict(abspath=False)
# Using wildcard expressions
# ==========================
# A simple example
print 'Searching any file starting with "setup." (depth=0):\n'
print ' ','\n  '.join(F.find('setup*', searchdir, depth=0, **fkw))
print
# Controlling the search depth
print 'Searching any file starting with "misc.file." (depth=None):\n'
print ' ','\n  '.join(F.find('misc.file.*', searchdir, depth=None,
    exclude=('*/build/*', ), **fkw))
print
# we've set the depth to None to remove the recursion limit (which defaults
# to 0, meaning 0 levels from the search directory)
# Multiple patterns
print 'Searching any file starting with "misc.config*" and ending with ".py", ".ini" or ".cfg":\n'
print ' ','\n  '.join(F.find(('misc.config*.py', 'misc.config*.ini', 'misc.config*.cfg',), searchdir, depth=None, **fkw))
print
# How to exclude
print 'Searching any file starting with "misc.file.", excluding "library" and ".svn" directories and "*.rst" files:\n'
print ' ','\n  '.join(F.find('*/misc.file.*', searchdir, depth=None,
    exclude=('*/library/*', '*/.svn/*', '*.rst'), matchall=True, **fkw))
print
# by default only file names are evaluated
# we've set matchall to True because .svn is part of the whole paths to be evaluated
# also note that we had to change the pattern to match whole paths
# Search directories only
# By default only files are looked up
print 'Searching any directory named misc:\n'
print ' ','\n  '.join(F.find('misc', searchdir, depth=None, files=False, dirs=True, **fkw))
print
# Using callbacks
# ===============
# Tracing the search with callbacks
print 'Showing search results with callbacks:\n'
def ondir(e):
    # we want to limit the outputs of this tutorial...
    if '/.svn' not in e and '/.git' not in e and '/build' not in e:
        print '  + evaluating directory:', e
def onfile(e):
    # we want to limit the outputs of this tutorial...
    if '/.svn' not in e and 'misc.file.' in e and '/.git' not in e and '/build' not in e:
        print '  - evaluating file:', e
def onmatch(e):
    print '  * matching entry:', e
F.find('*/misc.file.*', searchdir, depth=None,
    exclude=('*/library/*', '*/.svn/*', '*.rst', '*/.??*', '*/build/*'),
    matchall=True, ondir=ondir, onfile=onfile, onmatch=onmatch, **fkw)
print
# note that the ondir and onfile callbacks are called even if the path does not match
# Using regular expressions
# =========================
# A simple example
print 'Searching any file starting with "misc.file.":\n'
print ' ','\n  '.join(F.efind(r'misc\.file\..*', searchdir, **fkw))
print
# An advanced example which uses regex match results
print 'Searching any file starting with "misc.file." and show matching groups:\n'
for filepath, matchobj in F.xefind(r'misc\.file\.(.*)\.(?:py|rst)', searchdir, getmatch=True, **fkw):
    print '  file path: %s, match groups: %s'%(filepath, matchobj.groups())
print
# grouping is done with parentheses (), we could also have used a named group (?P<groupname1>groupexp) and matchobj.groupdict()
# note that the second group is a non-capturing one; we needed it to match both '.py' and '.rst' files
# also note that we use xefind instead of efind; this way we avoid building a huge list of (filepath, matchobj) pairs
# the list would not be huge here, this is just an example
os.chdir(PWD)
Changing the working directory to vacumm's root directory
Searching any file starting with "setup." (depth=0):
  ./setup.py
  ./setup.cfg.omp
  ./setup.cfg.simple
  ./setup.cfg
  ./setup.pyc
Searching any file starting with "misc.file." (depth=None):
  ./doc/sphinx/source/library/misc.file.rst
  ./doc/sphinx/source/tutorials/misc.file.rst
  ./doc/sphinx/build/html/_sources/library/misc.file.rst.txt
  ./doc/sphinx/build/html/_sources/tutorials/misc.file.rst.txt
  ./doc/sphinx/build/html/library/misc.file.html
  ./doc/sphinx/build/html/tutorials/misc.file.html
  ./doc/sphinx/build/doctrees/library/misc.file.doctree
  ./doc/sphinx/build/doctrees/tutorials/misc.file.doctree
  ./scripts/tutorials/misc.file.find.py
  ./scripts/tutorials/misc.file.mkdirs.py
  ./scripts/tutorials/misc.file.rollover.py
  ./scripts/tutorials/misc.file.strfsize.py
Searching any file starting with "misc.config*" and ending with ".py", ".ini" or ".cfg":
  ./scripts/tutorials/misc.config.argparse.cfg
  ./scripts/tutorials/misc.config.argparse.ini
  ./scripts/tutorials/misc.config.argparse.py
  ./scripts/tutorials/misc.config.cfg
  ./scripts/tutorials/misc.config.ini
  ./scripts/tutorials/misc.config.py
Searching any file starting with "misc.file.", excluding "library" and ".svn" directories and "*.rst" files:
  ./doc/sphinx/build/html/_sources/tutorials/misc.file.rst.txt
  ./doc/sphinx/build/html/tutorials/misc.file.html
  ./doc/sphinx/build/doctrees/tutorials/misc.file.doctree
  ./scripts/tutorials/misc.file.find.py
  ./scripts/tutorials/misc.file.mkdirs.py
  ./scripts/tutorials/misc.file.rollover.py
  ./scripts/tutorials/misc.file.strfsize.py
Searching any directory named misc:
  ./bin/testdev/misc
  ./doc/sphinx/build/html/_modules/vacumm/misc
  ./doc/sphinx/build/html/_modules/vacumm/data/misc
  ./lib/python/vacumm/misc
  ./lib/python/vacumm/data/misc
  ./build/src.linux-x86_64-2.7/vacumm/misc
  ./build/src.linux-x86_64-2.7/build/src.linux-x86_64-2.7/vacumm/misc
  ./build/temp.linux-x86_64-2.7/build/src.linux-x86_64-2.7/vacumm/misc
  ./build/temp.linux-x86_64-2.7/build/src.linux-x86_64-2.7/build/src.linux-x86_64-2.7/vacumm/misc
  ./build/temp.linux-x86_64-2.7/lib/python/vacumm/misc
Showing search results with callbacks:
  + evaluating directory: ./bin
  + evaluating directory: ./data
  + evaluating directory: ./doc
  + evaluating directory: ./etc
  + evaluating directory: ./lib
  + evaluating directory: ./scripts
  + evaluating directory: ./test
  + evaluating directory: ./.spyproject
  + evaluating directory: ./.pytest_cache
  + evaluating directory: ./.circleci
  + evaluating directory: ./dist
  + evaluating directory: ./bin/testdev
  + evaluating directory: ./bin/testdev/matlab_testdev
  + evaluating directory: ./bin/testdev/misc
  + evaluating directory: ./bin/testdev/scripts
  + evaluating directory: ./bin/testdev/misc/coloc
  + evaluating directory: ./bin/testdev/misc/stat
  + evaluating directory: ./bin/testdev/scripts/examples
  + evaluating directory: ./bin/testdev/scripts/scripts_old_chaine_sst
  + evaluating directory: ./data/ne_110m_land
  + evaluating directory: ./data/sea_level
  + evaluating directory: ./doc/sphinx
  + evaluating directory: ./doc/sphinx/source
  + evaluating directory: ./doc/sphinx/source/courses
  + evaluating directory: ./doc/sphinx/source/library
  + evaluating directory: ./doc/sphinx/source/sphinxext
  + evaluating directory: ./doc/sphinx/source/static
  + evaluating directory: ./doc/sphinx/source/templates
  + evaluating directory: ./doc/sphinx/source/tests
  + evaluating directory: ./doc/sphinx/source/tutorials
  + evaluating directory: ./doc/sphinx/source/bin
  + evaluating directory: ./doc/sphinx/source/images
  - evaluating file: ./doc/sphinx/source/library/misc.file.rst
  - evaluating file: ./doc/sphinx/source/tutorials/misc.file.rst
  + evaluating directory: ./etc/modulefiles
  + evaluating directory: ./lib/fortran
  + evaluating directory: ./lib/python
  + evaluating directory: ./lib/python/vacumm
  + evaluating directory: ./lib/python/vacumm/bathy
  + evaluating directory: ./lib/python/vacumm/data
  + evaluating directory: ./lib/python/vacumm/diag
  + evaluating directory: ./lib/python/vacumm/markup
  + evaluating directory: ./lib/python/vacumm/misc
  + evaluating directory: ./lib/python/vacumm/report
  + evaluating directory: ./lib/python/vacumm/sphinxext
  + evaluating directory: ./lib/python/vacumm/tide
  + evaluating directory: ./lib/python/vacumm/validator
  + evaluating directory: ./lib/python/vacumm/data/in_situ
  + evaluating directory: ./lib/python/vacumm/data/misc
  + evaluating directory: ./lib/python/vacumm/data/model
  + evaluating directory: ./lib/python/vacumm/data/satellite
  + evaluating directory: ./lib/python/vacumm/data/model/mars
  + evaluating directory: ./lib/python/vacumm/markup/markup-1.7
  + evaluating directory: ./lib/python/vacumm/markup/markup-1.7/doc
  + evaluating directory: ./lib/python/vacumm/misc/axml
  + evaluating directory: ./lib/python/vacumm/misc/cfgui
  + evaluating directory: ./lib/python/vacumm/misc/cpt
  + evaluating directory: ./lib/python/vacumm/misc/grid
  + evaluating directory: ./lib/python/vacumm/misc/phys
  + evaluating directory: ./lib/python/vacumm/misc/cfgui/controllers
  + evaluating directory: ./lib/python/vacumm/misc/cfgui/models
  + evaluating directory: ./lib/python/vacumm/misc/cfgui/resources
  + evaluating directory: ./lib/python/vacumm/misc/cfgui/tests
  + evaluating directory: ./lib/python/vacumm/misc/cfgui/utils
  + evaluating directory: ./lib/python/vacumm/misc/cfgui/views
  + evaluating directory: ./lib/python/vacumm/misc/cfgui/resources/ui
  + evaluating directory: ./lib/python/vacumm/misc/cfgui/tests/__pycache__
  + evaluating directory: ./lib/python/vacumm/report/ifroco
  + evaluating directory: ./lib/python/vacumm/validator/valid
  + evaluating directory: ./scripts/courses
  + evaluating directory: ./scripts/install
  + evaluating directory: ./scripts/test
  + evaluating directory: ./scripts/tutorials
  + evaluating directory: ./scripts/my
  + evaluating directory: ./scripts/test/__pycache__
  - evaluating file: ./scripts/tutorials/misc.file.find.py
  - evaluating file: ./scripts/tutorials/misc.file.mkdirs.py
  - evaluating file: ./scripts/tutorials/misc.file.rollover.py
  - evaluating file: ./scripts/tutorials/misc.file.strfsize.py
  * matching entry: ./scripts/tutorials/misc.file.find.py
  * matching entry: ./scripts/tutorials/misc.file.mkdirs.py
  * matching entry: ./scripts/tutorials/misc.file.rollover.py
  * matching entry: ./scripts/tutorials/misc.file.strfsize.py
  + evaluating directory: ./test/__pycache__
  + evaluating directory: ./.pytest_cache/v
  + evaluating directory: ./.pytest_cache/v/cache
Searching any file starting with "misc.file.":
  
Searching any file starting with "misc.file." and show matching groups:
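The depth and callback behaviour shown in the output above can be approximated with a short standard-library sketch. `walk_sketch` is a hypothetical helper, not the vacumm implementation, and it visits entries depth-first, so its ordering may differ from the real functions. It assumes, as the example comments state, that depth=0 means the search directory only and depth=None means no limit.

```python
import fnmatch
import os
import shutil
import tempfile


def walk_sketch(path, pattern, depth=0, ondir=None, onfile=None, onmatch=None):
    """Depth-limited wildcard search with optional callbacks (hypothetical)."""
    found = []
    for name in sorted(os.listdir(path)):
        full = os.path.join(path, name)
        if os.path.isdir(full):
            if ondir:
                ondir(full)  # called for every directory seen, matching or not
            if depth is None or depth > 0:
                sub = None if depth is None else depth - 1
                found += walk_sketch(full, pattern, sub, ondir, onfile, onmatch)
        else:
            if onfile:
                onfile(full)  # called for every file seen, matching or not
            if fnmatch.fnmatch(name, pattern):
                if onmatch:
                    onmatch(full)  # called for matching entries only
                found.append(full)
    return found


root = tempfile.mkdtemp()
try:
    os.makedirs(os.path.join(root, 'a', 'b'))
    for parts in (('top.txt',), ('a', 'mid.txt'), ('a', 'b', 'deep.txt')):
        open(os.path.join(root, *parts), 'w').close()
    seen_dirs = []
    shallow = walk_sketch(root, '*.txt', depth=0, ondir=seen_dirs.append)
    deep = walk_sketch(root, '*.txt', depth=None)
finally:
    shutil.rmtree(root)

print(len(shallow), len(deep))  # 1 3
```

With depth=0 the directory `a` is still reported to `ondir`, but the search does not descend into it.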
1.3.3.9.4. Making file backups (called rollover)
See rollover()
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from shutil import rmtree
from tempfile import mkdtemp
from os.path import join
import vacumm.misc.file as F
tmpdir = mkdtemp(dir='.')
afile = join(tmpdir, 'afile')
count = 2
suffix = '.backup%d'
try:
    F.rollover(afile, count=count, suffix=suffix)
    # nothing done, afile does not exist
    with open(afile, 'w') as f: f.write('0')
    # we created afile
    # afile contains 0
    F.rollover(afile, count=count, suffix=suffix)
    # did a copy of afile to afile.backup1
    with open(afile, 'w') as f: f.write('1')
    # afile contains 1
    # afile.backup1 contains 0
    F.rollover(afile, count=count, suffix=suffix)
    # did a copy of afile.backup1 to afile.backup2, and a copy of afile to afile.backup1
    with open(afile, 'w') as f: f.write('2')
    # afile contains 2
    # afile.backup1 contains 1
    # afile.backup2 contains 0
    F.rollover(afile, count=count, suffix=suffix)
    # afile.backup2 removed, copy of afile.backup1 to afile.backup2, copy of afile to afile.backup1
    # afile contains 2
    # afile.backup1 contains 2
    # afile.backup2 contains 1
finally:
    # cleaning
    rmtree(tmpdir)
print 'Empty output'
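The rotation described in the comments above can be reproduced with a minimal standard-library sketch. `rollover_sketch` is a hypothetical stand-in written for illustration; it is not the vacumm implementation and only mirrors the behaviour the comments describe (older backups shift up by one, the oldest is overwritten).

```python
import os
import shutil
import tempfile


def rollover_sketch(path, count=2, suffix='.backup%d'):
    """Copy path to a numbered backup, shifting older backups up by one."""
    if not os.path.exists(path):
        return False  # nothing to back up
    # Shift starting from the oldest slot so no backup is overwritten before
    # it has been copied; the copy into slot `count` discards the oldest one.
    for index in range(count - 1, 0, -1):
        older = path + suffix % index
        if os.path.exists(older):
            shutil.copy2(older, path + suffix % (index + 1))
    shutil.copy2(path, path + suffix % 1)
    return True


def read(path):
    with open(path) as f:
        return f.read()


tmpdir = tempfile.mkdtemp()
afile = os.path.join(tmpdir, 'afile')
try:
    missing = rollover_sketch(afile)  # afile does not exist yet
    for content in '012':
        with open(afile, 'w') as f:
            f.write(content)
        rollover_sketch(afile)
    state = [read(afile), read(afile + '.backup1'), read(afile + '.backup2')]
finally:
    shutil.rmtree(tmpdir)

print(state)  # ['2', '2', '1']
```

The final state matches the last three comments of the listing: afile and afile.backup1 both contain 2, and afile.backup2 contains 1.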
1.3.3.9.5. Displaying and analysing file sizes
See strfsize() and strpsize()
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import vacumm.misc.file as F
sizes = (
    1 * 10**3,
    1 * 2**10,
    1 * 10**6,
    1 * 2**20,
    1 * 10**9,
    1 * 2**30,
    1 * 10**12,
    1 * 2**40,
    1 * 10**15,
    1 * 2**50,
)
for size in sizes:
    fsize = F.strfsize(size, si=False)
    fsisize = F.strfsize(size, si=True)
    print 'size: %(size)14d, formatted: %(fsize)8s (CEI, SI: %(fsisize)8s)'%vars()
size:           1000, formatted:  1000 io (CEI, SI:     1 Ko)
size:           1024, formatted:    1 Kio (CEI, SI: 1.024 Ko)
size:        1000000, formatted: 976.562 Kio (CEI, SI:     1 Mo)
size:        1048576, formatted:    1 Mio (CEI, SI: 1.049 Mo)
size:     1000000000, formatted: 953.674 Mio (CEI, SI:     1 Go)
size:     1073741824, formatted:    1 Gio (CEI, SI: 1.074 Go)
size:  1000000000000, formatted: 931.323 Gio (CEI, SI:     1 To)
size:  1099511627776, formatted:    1 Tio (CEI, SI: 1.100 To)
size: 1000000000000000, formatted: 909.495 Tio (CEI, SI:     1 Po)
size: 1125899906842624, formatted:    1 Pio (CEI, SI: 1.126 Po)
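The two columns in this output differ only by the base used for scaling: powers of 1024 for the binary (IEC) prefixes and powers of 1000 for the SI prefixes. The arithmetic can be checked with a small sketch; `scale` is a hypothetical helper, not part of vacumm, and it returns the scaled value together with the power of the base rather than a formatted string.

```python
def scale(size, si=False):
    """Return (value, power) such that size == value * base**power."""
    base = 1000.0 if si else 1024.0
    value, power = float(size), 0
    # Divide by the base until the value fits under one unit step.
    while value >= base and power < 5:
        value /= base
        power += 1
    return value, power


million_iec = scale(10**6, si=False)  # (976.5625, 1): 976.5625 units of 1024 bytes
million_si = scale(10**6, si=True)    # (1.0, 2): one unit of 1000**2 bytes
mebi_iec = scale(2**20, si=False)     # (1.0, 2): one unit of 1024**2 bytes
mebi_si = scale(2**20, si=True)       # about (1.048576, 2)

print(million_iec, million_si, mebi_iec, mebi_si)
```

These values correspond to the 976.562 Kio, 1 Mo, 1 Mio and 1.049 Mo entries above once rounded to three decimals.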
1.3.3.9.6. Creating directories
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from os.path import abspath, dirname, join, realpath
from shutil import rmtree
from tempfile import mkdtemp
import vacumm.misc.file as F
tmpdir = mkdtemp(dir='.')
adir = join(tmpdir, 'path/to/adir')
afile = join(adir, 'afile')
anotherdir = join(tmpdir, 'path/to/anotherdir')
anotherfile = join(anotherdir, 'anotherfile')
try:
    
    print 'Created directory:', F.mkdirs(adir)
    F.mkfdirs(afile) # no effect, already created
    print 'Created directories:',  F.mkfdirs((afile, anotherfile))
    F.mkdirs((adir, anotherdir)) # no effect, already created
    
finally:
    rmtree(tmpdir)
Created directory: ./tmpWwq3u2/path/to/adir
Created directories: ['./tmpWwq3u2/path/to/anotherdir']
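The idea behind creating directories without failing on existing ones can be sketched with the standard library. `mkdirs_sketch` and `mkfdirs_sketch` are hypothetical helpers written for illustration; their return convention (the path when something was created, None otherwise) is an assumption modelled on the output above, not the documented vacumm behaviour.

```python
import os
import shutil
import tempfile


def mkdirs_sketch(path):
    """Create path with its parents; return it if created, None if it existed."""
    if os.path.isdir(path):
        return None
    os.makedirs(path)
    return path


def mkfdirs_sketch(filepath):
    """Create the directory that would contain filepath (the file is not created)."""
    return mkdirs_sketch(os.path.dirname(filepath))


tmpdir = tempfile.mkdtemp()
try:
    adir = os.path.join(tmpdir, 'path', 'to', 'adir')
    afile = os.path.join(adir, 'afile')
    first = mkdirs_sketch(adir)     # created: returns adir
    second = mkfdirs_sketch(afile)  # adir already exists: returns None
    file_exists = os.path.exists(afile)
finally:
    shutil.rmtree(tmpdir)

print(first is not None, second, file_exists)  # True None False
```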