Some hours ago, someone asked on StackExchange about a more pythonic way to represent the following code:
with open(read_csv, 'r') as read_file:
    with open(write_csv, 'r') as write_file:
        reader = csv.reader(read_file)
        writer = csv.writer(write_file)
        for row in reader:
            #Do some stuff to manipulate the fields from read_file and th
Unfortunately, the user deleted the question before I could post my suggestion. The only remaining evidence of the question is this tweet from PythonQuestions. I'm posting my extended answer here for future reference and to see whether someone more versed in Python might correct me.
Major update
Please have a look at the comments section below and some of the corrections posted there. Some suggested the use of contextlib, a part of the standard library I didn't know about yet. Also, it seems Python 2.7 and later can handle more than one context manager in a single with statement, so my "solution" isn't really needed.
My try
To start with, I do not think that the syntax used is too unpythonic. If it is only used once or twice, I wouldn't change anything about it. But if someone often wants to open more than one file within a single with statement, then writing a proxy class that follows the with statement's protocol might be useful. For opening two files it would look like this:
class openTwoFiles():
    def __init__(self, file1, file2):
        self.fn1 = file1
        self.fn2 = file2

    def __enter__(self):
        self.f1 = open(self.fn1, 'r')
        self.f2 = open(self.fn2, 'r')
        return (self.f1, self.f2)

    def __exit__(self, type, value, traceback):
        self.f1.close()
        self.f2.close()
The class openTwoFiles implements the two basic methods, __enter__ and __exit__, that the with statement requires. Using this class might look like this:
with openTwoFiles('test.txt', 'test2.txt') as (f1, f2):
    print f1
    print f2
Of course, for the specific code in question, an implementation built around the csv classes and an iterator over the reader might also help, but I'm sticking with the general case, in which opening the files suffices. The class openTwoFiles might be extended to open an arbitrary number of files with custom flags:
class openFiles():
    def __init__(self, files, flags):
        if isinstance(files, basestring):
            files = [files]
        if isinstance(flags, basestring):
            flags = [flags]
        assert len(flags) == len(files)
        self.files = files
        self.flags = flags

    def __enter__(self):
        self.fhs = []
        for f, fl in zip(self.files, self.flags):
            self.fhs.append(open(f, fl))
        return self.fhs

    def __exit__(self, type, value, traceback):
        for f in self.fhs:
            f.close()

with openFiles(['test.txt', 'test2.txt'], ['r', 'r']) as ll:
    print ll
Careful. Your context manager is worse than the original version that uses a nested with statement. The original version will be sure to close the first file if there is an exception opening the second file. Your version will not do this, though: it will raise an exception out of __enter__ and __exit__ will never be called, leaving the first file open until the garbage collector (or reference counting) deals with it.
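A minimal sketch of one way to close that gap, written for Python 3. The class name open_two_files is hypothetical, and this is only an illustration of the point above, not the post author's code: __enter__ closes the first file itself if the second open fails.

```python
class open_two_files:
    """Hypothetical variant of openTwoFiles that cleans up after a failed open."""

    def __init__(self, file1, file2):
        self.fn1 = file1
        self.fn2 = file2

    def __enter__(self):
        self.f1 = open(self.fn1, 'r')
        try:
            self.f2 = open(self.fn2, 'r')
        except Exception:
            # __exit__ is not called when __enter__ raises, so close f1 here.
            self.f1.close()
            raise
        return (self.f1, self.f2)

    def __exit__(self, exc_type, exc_value, traceback):
        # Close the second file first; close the first even if that fails.
        try:
            self.f2.close()
        finally:
            self.f1.close()
        return False  # do not swallow exceptions from the with body
```

Even so, the nested with statement from the question already gives this behaviour without any extra class.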
I suspect it’d be more Pythonic (although this is open to some debate) and would lead to a simpler implementation if the context manager took *args in its __init__, where *args was a tuple of tuples, and where each tuple was (filename, mode) or optionally just (filename, ). That way, each tuple in *args can be passed directly to file() or open(). In other words: https://gist.github.com/1578946
with open('file1') as f1, open('file2') as f2:
    …
For older versions of Python there’s contextlib.nested.
This could be written as
with open(read_csv, 'r') as read_file, open(write_csv, 'r') as write_file:
    pass
in Python 2.7 and Python 3, and as
with contextlib.nested(open(read_csv, 'r'), open(write_csv, 'r')) as (read_file, write_file):
    pass
in previous versions.
This seems to work just fine when everything is working fine. However, what happens when you try to open three files, and the second one can’t be opened? You’ll be left with an inconsistent state.
I think that in newer versions of Python this is also possible:
with open(read_csv, 'r') as read_file, open(write_csv, 'w') as write_file:
    pass
In Python 2.7 and later, it is possible to do this:
with open('file1') as file1, open('file2') as file2:
    # Read file1 and write in file2
In Python 2.7 and 3.2:
with open('one.file') as file_one, open('second.file') as file_second:
    print(file_one)
    print(file_second)
Maybe contextlib.nested could be of use?
[python]
from contextlib import nested
with nested(open('file1'), open('file2')) as (f1, f2):
    f1.readlines()
    f2.readlines()
[/python]
I’d prefer more generic behaviour for this so that you could do e.g.:
with opens('/tmp/foo.py', ('/tmp/bar.py', 'w')) as (f1, f2):
    f2.write(f1.read())
Something like the following would do the trick:
import contextlib

@contextlib.contextmanager
def opens(*fnames):
    fs = []
    for x in fnames:
        if not isinstance(x, tuple):
            x = (x, 'r')
        fs.append(open(*x))
    try:
        yield fs
    finally:
        for f in fs:
            f.close()
There are two major issues with your implementation:
1. If an error happens while closing a file, all subsequent files will be left open
2. If opening one of the files fails, all files previously opened will be left open
You should also close the files in the reverse of the order in which you opened them, but that’s a minor issue.
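A rough sketch of a helper that addresses both issues and closes in reverse order, written for Python 3. The name open_all and the "filename or (filename, mode) tuple" convention are assumptions made for illustration only:

```python
import contextlib


@contextlib.contextmanager
def open_all(*specs):
    """Hypothetical helper: each spec is a filename or a (filename, mode) tuple."""
    handles = []
    try:
        for spec in specs:
            if not isinstance(spec, tuple):
                spec = (spec, 'r')
            # If this open() raises, the finally clause closes earlier files.
            handles.append(open(*spec))
        yield handles
    finally:
        first_error = None
        # Close in reverse order of opening, and keep going after a failure.
        for handle in reversed(handles):
            try:
                handle.close()
            except Exception as exc:
                if first_error is None:
                    first_error = exc
        if first_error is not None:
            raise first_error
```

Usage would look like `with open_all('in.txt', ('out.txt', 'w')) as (src, dst):`. The opens happen inside the try block, so a failure partway through still reaches the cleanup loop.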
Overall, the original code should be left as-is when using Python 2.5 or 2.6. For Python 2.7-only code or Python 3 code, `with` can be used with multiple managers out of the box:
with open(path1, 'r') as file1, open(path2, 'r') as file2:
    # stuff.
See examples at the bottom of the “with statement” section: http://docs.python.org/reference/compound_stmts.html#with
You can also use Python 3.2 context managers:
with open('in.txt') as infile, open('out.txt', 'w') as outfile:
    pass
This seems a long way to go when you can just write (works in 2.7):
with open('file1') as f,\
        open('file2') as g:
    …
In Python 2.7 you can open two files in the same with statement.
with open(x) as u, open(y) as w:
    pass
Wow. Thank you guys for the input; I will have to go through the comments slowly.
Nooo, don’t use contextlib.nested(). It’s an error-prone bug trap, and this kind of situation is exactly why it was deprecated in 2.7 and 3.2 and will be gone completely in 3.3. (Quiz: what happens if the first open() call succeeds, but the second one fails?)
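To make the quiz concrete: the arguments of a call are evaluated before the call itself runs, so the first file is already open by the time nested() could do anything about it. A small Python 3 sketch of that ordering, where Tracked and fake_nested are hypothetical stand-ins (contextlib.nested itself is not available there):

```python
class Tracked:
    """Hypothetical stand-in for a file object that records its state."""
    opened = []

    def __init__(self, name, fail=False):
        if fail:
            raise IOError('cannot open %s' % name)
        self.name = name
        self.closed = False
        Tracked.opened.append(self)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.closed = True
        return False


def fake_nested(*managers):
    """Hypothetical stand-in for nested(); never reached in the failing case."""
    raise AssertionError('not reached')


try:
    # Both arguments are built before fake_nested is called, so the second
    # constructor raises while the first object already exists.
    with fake_nested(Tracked('file1'), Tracked('file2', fail=True)):
        pass
except IOError:
    pass

# Nothing ever called __exit__ on the first object, so it was never closed.
leaked = [t.name for t in Tracked.opened if not t.closed]
print(leaked)  # ['file1']
```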
I’m the main contextlib maintainer, and contextlib2.ContextStack is a new API I have created and published on PyPI that will hopefully provide the dynamic context management benefits of nested() without being anywhere near as error-prone. The “open a data-driven number of files at the same time” use case is actually the recurring example I use in the documentation (http://contextlib2.readthedocs.org/en/latest/index.html#contextlib2.ContextStack)
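A minimal sketch of that use case, assuming Python 3.3 or later, where the standard library offers contextlib.ExitStack for the same purpose. The helper name read_all is hypothetical:

```python
import contextlib


def read_all(filenames):
    """Hypothetical helper: open every file in the list and return the contents."""
    with contextlib.ExitStack() as stack:
        # enter_context registers each file with the stack, so files already
        # opened are closed even if a later open() in this loop raises.
        files = [stack.enter_context(open(name)) for name in filenames]
        return [f.read() for f in files]
```

Files are closed in the reverse of the order in which they were registered when the with block ends.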
This would be more Pythonic if it followed PEP-8 & avoided camelCase.