1.3.3.9. File utilities
See also the API documentation: file
1.3.3.9.1. Introduction
The file module provides various file system related features, such as:
- filesystem traversal with depth support
- file search, wildcard or regex based
- file rollover (backup)
- size parsing and formatting
- directory creation without error on existing directory
1.3.3.9.3. File search
The find/xfind/efind/xefind functions provide flexible lookup of files and directories.
xfind and xefind are generator functions: calling them does not build a potentially huge list of files and directories, so you can process the results in a loop or use the optional callbacks described below.
find and efind simply call xfind and xefind and return the lookup result as a list.
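The generator/list distinction can be illustrated with the standard library alone. The sketch below is not vacumm's implementation: `iter_files` and `list_files` are hypothetical names, and the depth convention (0 means the search directory only, None means no limit) is an assumption modelled on the comments in the example further down.

```python
import fnmatch
import os
import shutil
import tempfile


def iter_files(pattern, root, depth=0):
    """Lazily yield paths under root whose file name matches a wildcard pattern.

    depth=0 looks at root only, depth=None removes the limit
    (assumed convention, for illustration only).
    """
    for dirpath, dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        level = 0 if rel == os.curdir else rel.count(os.sep) + 1
        if depth is not None and level >= depth:
            # Emptying dirnames in place stops os.walk from descending further
            del dirnames[:]
        dirnames.sort()
        for name in sorted(filenames):
            if fnmatch.fnmatch(name, pattern):
                yield os.path.join(dirpath, name)


def list_files(pattern, root, depth=0):
    """Eager variant: consume the generator and return a list."""
    return list(iter_files(pattern, root, depth))


# Small throwaway tree to exercise both variants
tmp = tempfile.mkdtemp()
try:
    os.makedirs(os.path.join(tmp, 'sub'))
    for rel in ('setup.py', 'setup.cfg', os.path.join('sub', 'setup.txt')):
        open(os.path.join(tmp, rel), 'w').close()
    top_only = [os.path.relpath(p, tmp)
                for p in list_files('setup*', tmp, depth=0)]
    everything = [os.path.relpath(p, tmp)
                  for p in iter_files('setup*', tmp, depth=None)]
finally:
    shutil.rmtree(tmp)
```

The generator variant only holds one path at a time, which is what makes it suitable for very large trees; the list variant is more convenient when the result is small.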
The simplest way to search is to use a wildcard pattern (?, *, [seq], [!seq])
with find/xfind. See fnmatch
for details on wildcard patterns, how they work and
how they are supported.
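As a quick reminder of what each wildcard does, here is a short sketch that uses the standard fnmatch module directly. The file names are made up for illustration, and it assumes find/xfind follow plain fnmatch semantics, as the reference above suggests.

```python
import fnmatch

# Hypothetical file names, for illustration only
names = ['setup.py', 'setup.cfg', 'data1.nc', 'data2.nc', 'dataA.nc']

# '*' matches any run of characters (including none)
star = fnmatch.filter(names, 'setup.*')
# '?' matches exactly one character
question = fnmatch.filter(names, 'data?.nc')
# '[seq]' matches one character taken from seq
in_seq = fnmatch.filter(names, 'data[12].nc')
# '[!seq]' matches one character that is not in seq
not_in_seq = fnmatch.filter(names, 'data[!12].nc')
```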
Sometimes a more precise lookup is needed. Regular expressions are a
very powerful tool for that purpose, although they take a bit of understanding. See
re
and http://www.expreg.com/.
Another advantage of the regex method is that you can extract parts of the selected
file paths by using the getmatch parameter with efind/xefind.
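The idea of extracting parts of a matched name can be shown with the re module alone. This is a sketch under assumptions: the file names are hypothetical, and it uses `re.match` on bare file names, which may differ from how efind/xefind apply the expression internally.

```python
import re

# Hypothetical file names, for illustration only
names = ['misc.file.find.py', 'misc.file.rollover.py',
         'misc.file.rst', 'misc.config.py']

# One capturing group for the topic, one non-capturing group for the extension
regex = re.compile(r'misc\.file\.(.*)\.(?:py|rst)$')

topics = {}
for name in names:
    match = regex.match(name)
    if match:
        # group(1) is the text captured by the first pair of parentheses
        topics[name] = match.group(1)
```

Only the names that contain a topic between `misc.file.` and the extension are kept, and the match object gives direct access to that topic without any extra string slicing.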
find and efind share most features, but a few differences exist:
- find/xfind accept a single pattern/exclude or a list of them; efind/xefind do not, which should not be needed given the capabilities of regular expressions.
- efind/xefind can return (matching path, match object) pairs; find/xfind cannot.
Using wildcard expressions: see xfind() and find().
Using regular expressions: see xefind() and efind().
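The first difference listed above (several wildcard patterns versus one regular expression) can be sketched with fnmatch and re only. The names below are hypothetical and this is not vacumm code; it merely shows that an alternation can stand in for a list of patterns.

```python
import fnmatch
import re

# Hypothetical file names, for illustration only
names = ['misc.config.py', 'misc.config.ini', 'misc.config.cfg',
         'misc.config.rst', 'misc.file.py']

# Wildcard style: one pattern per accepted extension
patterns = ('misc.config*.py', 'misc.config*.ini', 'misc.config*.cfg')
by_wildcards = [n for n in names
                if any(fnmatch.fnmatch(n, p) for p in patterns)]

# Regex style: a single expression with an alternation
regex = re.compile(r'misc\.config.*\.(?:py|ini|cfg)$')
by_regex = [n for n in names if regex.match(n)]
```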
1.3.3.9.3.1. The example
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import os
import vacumm.misc.file as F
print 'Changing the working directory to vacumm\'s root directory'
print
PWD = os.getcwd()
os.chdir(os.path.dirname(os.path.dirname(os.path.dirname(os.path.realpath(__file__)))))
searchdir = '.'
fkw = dict(abspath=False)
# Using wildcard expressions
# ==========================
# A simple example
print 'Searching any file starting with "setup." (depth=0):\n'
print ' ','\n '.join(F.find('setup*', searchdir, depth=0, **fkw))
print
# Controlling the search depth
print 'Searching any file starting with "misc.file." (depth=None):\n'
print ' ','\n '.join(F.find('misc.file.*', searchdir, depth=None,
    exclude=('*/build/*', ), **fkw))
print
# we've set the depth to None to suppress the recursion limit (which defaults
# to 0, meaning 0 levels from the search directory)
# Multiple patterns
print 'Searching any file starting with "misc.config*" and ending with ".py", ".ini" or ".cfg":\n'
print ' ','\n '.join(F.find(('misc.config*.py', 'misc.config*.ini', 'misc.config*.cfg',), searchdir, depth=None, **fkw))
print
# How to exclude
print 'Searching any file starting with "misc.file.", excluding "library" and ".svn" directories and "*.rst" files:\n'
print ' ','\n '.join(F.find('*/misc.file.*', searchdir, depth=None,
    exclude=('*/library/*', '*/.svn/*', '*.rst'), matchall=True, **fkw))
print
# by default only file names are evaluated
# we've set matchall to True because .svn is part of the whole paths to be evaluated
# also note that we had to change the pattern to match whole paths
# Search directories only
# By default only files are looked up
print 'Searching any directory named misc:\n'
print ' ','\n '.join(F.find('misc', searchdir, depth=None, files=False, dirs=True, **fkw))
print
# Using callbacks
# ===============
# Tracing the search with callbacks
print 'Showing search results with callbacks:\n'
def ondir(e):
    # we want to limit the outputs of this tutorial...
    if '/.svn' not in e and '/.git' not in e and '/build' not in e:
        print ' + evaluating directory:', e
def onfile(e):
    # we want to limit the outputs of this tutorial...
    if '/.svn' not in e and 'misc.file.' in e and '/.git' not in e and '/build' not in e:
        print ' - evaluating file:', e
def onmatch(e):
    print ' * matching entry:', e
F.find('*/misc.file.*', searchdir, depth=None,
    exclude=('*/library/*', '*/.svn/*', '*.rst', '*/.??*', '*/build/*'),
    matchall=True, ondir=ondir, onfile=onfile, onmatch=onmatch, **fkw)
print
# note that the ondir and onfile callbacks are called even if the path does not match
# Using regular expressions
# =========================
# A simple example
print 'Searching any file starting with "misc.file.":\n'
print ' ','\n '.join(F.efind(r'misc\.file\..*', searchdir, **fkw))
print
# An advanced example which uses regex match results
print 'Searching any file starting with "misc.file." and show matching groups:\n'
for filepath, matchobj in F.xefind(r'misc\.file\.(.*)\.(?:py|rst)', searchdir, getmatch=True, **fkw):
    print ' file path: %s, match groups: %s'%(filepath, matchobj.groups())
print
# grouping is done with parentheses (); we could also have used a named group (?P<groupname1>groupexp) and matchobj.groupdict()
# note that the second group is a non-capturing one; we needed it to include both '.py' and '.rst' files
# also note that we use xefind instead of efind; this way we avoid building a huge list of (filepath, matchobj) pairs
# this is not an issue here, it is just an example
os.chdir(PWD)
Changing the working directory to vacumm's root directory
Searching any file starting with "setup." (depth=0):
./setup.py
./setup.cfg.omp
./setup.cfg.simple
./setup.cfg
./setup.pyc
Searching any file starting with "misc.file." (depth=None):
./doc/sphinx/source/library/misc.file.rst
./doc/sphinx/source/tutorials/misc.file.rst
./doc/sphinx/build/html/_sources/library/misc.file.rst.txt
./doc/sphinx/build/html/_sources/tutorials/misc.file.rst.txt
./doc/sphinx/build/html/library/misc.file.html
./doc/sphinx/build/html/tutorials/misc.file.html
./doc/sphinx/build/doctrees/library/misc.file.doctree
./doc/sphinx/build/doctrees/tutorials/misc.file.doctree
./scripts/tutorials/misc.file.find.py
./scripts/tutorials/misc.file.mkdirs.py
./scripts/tutorials/misc.file.rollover.py
./scripts/tutorials/misc.file.strfsize.py
Searching any file starting with "misc.config*" and ending with ".py", ".ini" or ".cfg":
./scripts/tutorials/misc.config.argparse.cfg
./scripts/tutorials/misc.config.argparse.ini
./scripts/tutorials/misc.config.argparse.py
./scripts/tutorials/misc.config.cfg
./scripts/tutorials/misc.config.ini
./scripts/tutorials/misc.config.py
Searching any file starting with "misc.file.", excluding "library" and ".svn" directories and "*.rst" files:
./doc/sphinx/build/html/_sources/tutorials/misc.file.rst.txt
./doc/sphinx/build/html/tutorials/misc.file.html
./doc/sphinx/build/doctrees/tutorials/misc.file.doctree
./scripts/tutorials/misc.file.find.py
./scripts/tutorials/misc.file.mkdirs.py
./scripts/tutorials/misc.file.rollover.py
./scripts/tutorials/misc.file.strfsize.py
Searching any directory named misc:
./bin/testdev/misc
./doc/sphinx/build/html/_modules/vacumm/misc
./doc/sphinx/build/html/_modules/vacumm/data/misc
./lib/python/vacumm/misc
./lib/python/vacumm/data/misc
./build/src.linux-x86_64-2.7/vacumm/misc
./build/src.linux-x86_64-2.7/build/src.linux-x86_64-2.7/vacumm/misc
./build/temp.linux-x86_64-2.7/build/src.linux-x86_64-2.7/vacumm/misc
./build/temp.linux-x86_64-2.7/build/src.linux-x86_64-2.7/build/src.linux-x86_64-2.7/vacumm/misc
./build/temp.linux-x86_64-2.7/lib/python/vacumm/misc
Showing search results with callbacks:
+ evaluating directory: ./bin
+ evaluating directory: ./data
+ evaluating directory: ./doc
+ evaluating directory: ./etc
+ evaluating directory: ./lib
+ evaluating directory: ./scripts
+ evaluating directory: ./test
+ evaluating directory: ./.spyproject
+ evaluating directory: ./.pytest_cache
+ evaluating directory: ./.circleci
+ evaluating directory: ./dist
+ evaluating directory: ./bin/testdev
+ evaluating directory: ./bin/testdev/matlab_testdev
+ evaluating directory: ./bin/testdev/misc
+ evaluating directory: ./bin/testdev/scripts
+ evaluating directory: ./bin/testdev/misc/coloc
+ evaluating directory: ./bin/testdev/misc/stat
+ evaluating directory: ./bin/testdev/scripts/examples
+ evaluating directory: ./bin/testdev/scripts/scripts_old_chaine_sst
+ evaluating directory: ./data/ne_110m_land
+ evaluating directory: ./data/sea_level
+ evaluating directory: ./doc/sphinx
+ evaluating directory: ./doc/sphinx/source
+ evaluating directory: ./doc/sphinx/source/courses
+ evaluating directory: ./doc/sphinx/source/library
+ evaluating directory: ./doc/sphinx/source/sphinxext
+ evaluating directory: ./doc/sphinx/source/static
+ evaluating directory: ./doc/sphinx/source/templates
+ evaluating directory: ./doc/sphinx/source/tests
+ evaluating directory: ./doc/sphinx/source/tutorials
+ evaluating directory: ./doc/sphinx/source/bin
+ evaluating directory: ./doc/sphinx/source/images
- evaluating file: ./doc/sphinx/source/library/misc.file.rst
- evaluating file: ./doc/sphinx/source/tutorials/misc.file.rst
+ evaluating directory: ./etc/modulefiles
+ evaluating directory: ./lib/fortran
+ evaluating directory: ./lib/python
+ evaluating directory: ./lib/python/vacumm
+ evaluating directory: ./lib/python/vacumm/bathy
+ evaluating directory: ./lib/python/vacumm/data
+ evaluating directory: ./lib/python/vacumm/diag
+ evaluating directory: ./lib/python/vacumm/markup
+ evaluating directory: ./lib/python/vacumm/misc
+ evaluating directory: ./lib/python/vacumm/report
+ evaluating directory: ./lib/python/vacumm/sphinxext
+ evaluating directory: ./lib/python/vacumm/tide
+ evaluating directory: ./lib/python/vacumm/validator
+ evaluating directory: ./lib/python/vacumm/data/in_situ
+ evaluating directory: ./lib/python/vacumm/data/misc
+ evaluating directory: ./lib/python/vacumm/data/model
+ evaluating directory: ./lib/python/vacumm/data/satellite
+ evaluating directory: ./lib/python/vacumm/data/model/mars
+ evaluating directory: ./lib/python/vacumm/markup/markup-1.7
+ evaluating directory: ./lib/python/vacumm/markup/markup-1.7/doc
+ evaluating directory: ./lib/python/vacumm/misc/axml
+ evaluating directory: ./lib/python/vacumm/misc/cfgui
+ evaluating directory: ./lib/python/vacumm/misc/cpt
+ evaluating directory: ./lib/python/vacumm/misc/grid
+ evaluating directory: ./lib/python/vacumm/misc/phys
+ evaluating directory: ./lib/python/vacumm/misc/cfgui/controllers
+ evaluating directory: ./lib/python/vacumm/misc/cfgui/models
+ evaluating directory: ./lib/python/vacumm/misc/cfgui/resources
+ evaluating directory: ./lib/python/vacumm/misc/cfgui/tests
+ evaluating directory: ./lib/python/vacumm/misc/cfgui/utils
+ evaluating directory: ./lib/python/vacumm/misc/cfgui/views
+ evaluating directory: ./lib/python/vacumm/misc/cfgui/resources/ui
+ evaluating directory: ./lib/python/vacumm/misc/cfgui/tests/__pycache__
+ evaluating directory: ./lib/python/vacumm/report/ifroco
+ evaluating directory: ./lib/python/vacumm/validator/valid
+ evaluating directory: ./scripts/courses
+ evaluating directory: ./scripts/install
+ evaluating directory: ./scripts/test
+ evaluating directory: ./scripts/tutorials
+ evaluating directory: ./scripts/my
+ evaluating directory: ./scripts/test/__pycache__
- evaluating file: ./scripts/tutorials/misc.file.find.py
- evaluating file: ./scripts/tutorials/misc.file.mkdirs.py
- evaluating file: ./scripts/tutorials/misc.file.rollover.py
- evaluating file: ./scripts/tutorials/misc.file.strfsize.py
* matching entry: ./scripts/tutorials/misc.file.find.py
* matching entry: ./scripts/tutorials/misc.file.mkdirs.py
* matching entry: ./scripts/tutorials/misc.file.rollover.py
* matching entry: ./scripts/tutorials/misc.file.strfsize.py
+ evaluating directory: ./test/__pycache__
+ evaluating directory: ./.pytest_cache/v
+ evaluating directory: ./.pytest_cache/v/cache
Searching any file starting with "misc.file.":
Searching any file starting with "misc.file." and show matching groups:
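The comments in the example state that only file names are evaluated by default and that matchall=True makes the whole path count. The sketch below reproduces that distinction with fnmatch. `matches` is a hypothetical helper, the third path is invented, and applying the exclusions to the same target as the pattern is an assumption rather than documented vacumm behaviour.

```python
import fnmatch
import posixpath


def matches(path, pattern, exclude=(), matchall=False):
    """Return True if path is selected (illustrative semantics only).

    With matchall=False only the last path component is tested,
    with matchall=True the whole path is.
    """
    target = path if matchall else posixpath.basename(path)
    if any(fnmatch.fnmatchcase(target, e) for e in exclude):
        return False
    return fnmatch.fnmatchcase(target, pattern)


paths = [
    './doc/sphinx/source/library/misc.file.rst',
    './scripts/tutorials/misc.file.find.py',
    './scripts/tutorials/.svn/misc.file.find.py',  # invented for the demo
]

# File names only: a directory-based exclusion can never trigger
names_only = [p for p in paths
              if matches(p, 'misc.file.*', exclude=('*/.svn/*',))]

# Whole paths: the pattern needs a leading '*/' and directories can be excluded
whole_paths = [p for p in paths
               if matches(p, '*/misc.file.*',
                          exclude=('*/library/*', '*/.svn/*', '*.rst'),
                          matchall=True)]
```

This is why the example had to switch its pattern from `misc.file.*` to `*/misc.file.*` once whole paths were being matched.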
1.3.3.9.4. Making file backups (called rollover)
See rollover()
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from shutil import rmtree
from tempfile import mkdtemp
from os.path import join
import vacumm.misc.file as F
tmpdir = mkdtemp(dir='.')
afile = join(tmpdir, 'afile')
count = 2
suffix = '.backup%d'
try:
    F.rollover(afile, count=count, suffix=suffix)
    # nothing done, afile does not exist
    with open(afile, 'w') as f: f.write('0')
    # we created afile
    # afile contains 0
    F.rollover(afile, count=count, suffix=suffix)
    # did a copy of afile to afile.backup1
    with open(afile, 'w') as f: f.write('1')
    # afile contains 1
    # afile.backup1 contains 0
    F.rollover(afile, count=count, suffix=suffix)
    # did a copy of afile.backup1 to afile.backup2, and a copy of afile to afile.backup1
    with open(afile, 'w') as f: f.write('2')
    # afile contains 2
    # afile.backup1 contains 1
    # afile.backup2 contains 0
    F.rollover(afile, count=count, suffix=suffix)
    # afile.backup2 removed, copy of afile.backup1 to afile.backup2, copy of afile to afile.backup1
    # afile contains 2
    # afile.backup1 contains 2
    # afile.backup2 contains 1
finally:
    # cleaning
    rmtree(tmpdir)
print 'Empty output'
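The rotation described in the comments above can be sketched with the standard library. `rotate_backups` is a hypothetical helper written to mirror those comments; it is not the code behind rollover(), whose exact behaviour (for example its return value) is not covered here.

```python
import os
import shutil
import tempfile


def rotate_backups(path, count=2, suffix='.backup%d'):
    """Keep up to `count` numbered copies of path (hypothetical helper).

    Returns False when path does not exist, True otherwise.
    """
    if not os.path.exists(path):
        return False
    # Shift the oldest backups first so nothing is overwritten too early:
    # backup(count-1) -> backup(count), ..., backup1 -> backup2
    for i in range(count - 1, 0, -1):
        src = path + suffix % i
        if os.path.exists(src):
            shutil.move(src, path + suffix % (i + 1))
    # The current file becomes the newest backup; the original stays in place
    shutil.copy2(path, path + suffix % 1)
    return True


tmpdir = tempfile.mkdtemp()
try:
    afile = os.path.join(tmpdir, 'afile')
    # Write three successive versions, rotating after each write
    for content in ('0', '1', '2'):
        with open(afile, 'w') as f:
            f.write(content)
        rotate_backups(afile, count=2)
    # Record the final content of every file in the directory
    state = {}
    for name in sorted(os.listdir(tmpdir)):
        with open(os.path.join(tmpdir, name)) as f:
            state[name] = f.read()
finally:
    shutil.rmtree(tmpdir)
```

After the three rotations the oldest version ('0') has been dropped, which matches the final state described in the comments of the example.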
1.3.3.9.5. Displaying and analysing file sizes
See strfsize() and strpsize()
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import vacumm.misc.file as F
sizes = (
    1 * 10**3,
    1 * 2**10,
    1 * 10**6,
    1 * 2**20,
    1 * 10**9,
    1 * 2**30,
    1 * 10**12,
    1 * 2**40,
    1 * 10**15,
    1 * 2**50,
)
for size in sizes:
    fsize = F.strfsize(size, si=False)
    fsisize = F.strfsize(size, si=True)
    print 'size: %(size)14d, formatted: %(fsize)8s (CEI, SI: %(fsisize)8s)'%vars()
size: 1000, formatted: 1000 io (CEI, SI: 1 Ko)
size: 1024, formatted: 1 Kio (CEI, SI: 1.024 Ko)
size: 1000000, formatted: 976.562 Kio (CEI, SI: 1 Mo)
size: 1048576, formatted: 1 Mio (CEI, SI: 1.049 Mo)
size: 1000000000, formatted: 953.674 Mio (CEI, SI: 1 Go)
size: 1073741824, formatted: 1 Gio (CEI, SI: 1.074 Go)
size: 1000000000000, formatted: 931.323 Gio (CEI, SI: 1 To)
size: 1099511627776, formatted: 1 Tio (CEI, SI: 1.100 To)
size: 1000000000000000, formatted: 909.495 Tio (CEI, SI: 1 Po)
size: 1125899906842624, formatted: 1 Pio (CEI, SI: 1.126 Po)
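The difference between the two columns is only the base: 1024 for the binary (IEC, "CEI" in French) multiples and 1000 for the SI ones. The sketch below shows that arithmetic, plus the reverse parsing direction, which is presumably the role of strpsize(). `format_size` and `parse_size` are hypothetical helpers, and they use byte-based labels (KiB, kB) instead of the octet-based ones (Kio, Ko) seen in the output above.

```python
import re

SI_UNITS = ('B', 'kB', 'MB', 'GB', 'TB', 'PB')
IEC_UNITS = ('B', 'KiB', 'MiB', 'GiB', 'TiB', 'PiB')

# Map every unit label to its multiplier in bytes
FACTORS = {}
for power, (si_unit, iec_unit) in enumerate(zip(SI_UNITS, IEC_UNITS)):
    FACTORS[si_unit] = 1000 ** power
    FACTORS[iec_unit] = 1024 ** power


def format_size(size, si=False):
    """Format a byte count with SI (1000) or binary (1024) multiples."""
    base = 1000.0 if si else 1024.0
    units = SI_UNITS if si else IEC_UNITS
    value = float(size)
    index = 0
    # Divide until the value fits the current unit or units run out
    while abs(value) >= base and index < len(units) - 1:
        value /= base
        index += 1
    # Keep at most three decimals and drop trailing zeros
    text = ('%.3f' % value).rstrip('0').rstrip('.')
    return '%s %s' % (text, units[index])


def parse_size(text):
    """Parse strings such as '1.5 MB' or '1 KiB' back to a byte count."""
    match = re.match(r'\s*([0-9]*\.?[0-9]+)\s*([A-Za-z]+)\s*$', text)
    if not match or match.group(2) not in FACTORS:
        raise ValueError('cannot parse size: %r' % (text,))
    number, unit = match.groups()
    return int(round(float(number) * FACTORS[unit]))
```

For instance, `format_size(1024)` gives `'1 KiB'` while `format_size(1024, si=True)` gives `'1.024 kB'`, the same relationship as in the second line of the output above.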
1.3.3.9.6. Creating directories
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from os.path import abspath, dirname, join, realpath
from shutil import rmtree
from tempfile import mkdtemp
import vacumm.misc.file as F
tmpdir = mkdtemp(dir='.')
adir = join(tmpdir, 'path/to/adir')
afile = join(adir, 'afile')
anotherdir = join(tmpdir, 'path/to/anotherdir')
anotherfile = join(anotherdir, 'anotherfile')
try:
    print 'Created directory:', F.mkdirs(adir)
    F.mkfdirs(afile) # no effect, already created
    print 'Created directories:', F.mkfdirs((afile, anotherfile))
    F.mkdirs((adir, anotherdir)) # no effect, already created
finally:
    rmtree(tmpdir)
Created directory: ./tmpWwq3u2/path/to/adir
Created directories: ['./tmpWwq3u2/path/to/anotherdir']
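"Directory creation without error on existing directory" can be sketched with os.makedirs and an errno check. `ensure_dir` and `ensure_parent_dir` are hypothetical helpers that return a boolean, whereas the output above suggests that mkdirs() and mkfdirs() report the paths they created, so treat this as an illustration of the idea only.

```python
import errno
import os
import shutil
import tempfile


def ensure_dir(path):
    """Create path and its parents.

    Returns True if the directory was created, False if it already existed.
    """
    try:
        os.makedirs(path)
    except OSError as exc:
        # Only swallow the "already exists" case, and only for a directory
        if exc.errno != errno.EEXIST or not os.path.isdir(path):
            raise
        return False
    return True


def ensure_parent_dir(filepath):
    """Same as ensure_dir, applied to the directory that would hold filepath."""
    return ensure_dir(os.path.dirname(filepath) or os.curdir)


tmpdir = tempfile.mkdtemp()
try:
    adir = os.path.join(tmpdir, 'path', 'to', 'adir')
    first = ensure_dir(adir)    # creates path/to/adir
    second = ensure_dir(adir)   # no effect, already created
    third = ensure_parent_dir(os.path.join(adir, 'afile'))  # same directory
    exists = os.path.isdir(adir)
finally:
    shutil.rmtree(tmpdir)
```

Checking `os.path.isdir` in the error branch avoids silently accepting a regular file that happens to have the requested name.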