python - Dumping JSON directly into a tarfile
I have a large list of dict objects. I would like to store this list in a tar file to exchange it remotely. I have done that successfully by writing a json.dumps() string to a tarfile object opened in 'w:gz' mode.
I am now trying a piped implementation, opening the tarfile object in 'w|gz' mode. Here is my code so far:
    from json import dump
    from io import StringIO
    import tarfile

    with StringIO() as out_stream, tarfile.open(filename, 'w|gz', out_stream) as tar_file:
        for packet in json_io_format(data):
            dump(packet, out_stream)
This code is in a function 'write_data'. 'json_io_format' is a generator that returns one dict object at a time from the dataset (so packet is a dict).
Here is my error:
    Traceback (most recent call last):
      File "pdml_parser.py", line 35, in write_data
        dump(packet, out_stream)
      File "/.../anaconda3/lib/python3.5/tarfile.py", line 2397, in __exit__
        self.close()
      File "/.../anaconda3/lib/python3.5/tarfile.py", line 1733, in close
        self.fileobj.close()
      File "/.../anaconda3/lib/python3.5/tarfile.py", line 459, in close
        self.fileobj.write(self.buf)
    TypeError: string argument expected, got 'bytes'
After some troubleshooting in the comments, it turns out the error is raised when the 'with' statement exits and calls the context manager's __exit__. I believe this in turn calls TarFile.close(). If I remove the tarfile.open() call from the 'with' statement, and purposefully leave out TarFile.close(), I get this code:
    with StringIO() as out_stream:
        tar_file = tarfile.open(filename, 'w|gz', out_stream)
        for packet in json_io_format(data):
            dump(packet, out_stream)
This version of the program completes, but it does not produce the output file 'filename' and yields this error:
    Exception ignored in: <bound method _Stream.__del__ of <tarfile._Stream object at 0x7fca7a352b00>>
    Traceback (most recent call last):
      File "/.../anaconda3/lib/python3.5/tarfile.py", line 411, in __del__
        self.close()
      File "/.../anaconda3/lib/python3.5/tarfile.py", line 459, in close
        self.fileobj.write(self.buf)
    TypeError: string argument expected, got 'bytes'
I believe this is triggered by the garbage collector. Something is preventing the TarFile object from closing.
Can anyone help me figure out what is going on here?
Why do you think you can write a tarfile to a StringIO? That doesn't work the way you think it does.
This approach doesn't error, but it's not actually how you create a tarfile in memory from in-memory objects.
    from json import dumps
    from io import BytesIO
    import tarfile

    data = [{'foo': 'bar'}, {'cheese': None}, ]
    filename = 'fnord'

    with BytesIO() as out_stream, tarfile.open(filename, 'w|gz', out_stream) as tar_file:
        for packet in data:
            out_stream.write(dumps(packet).encode())
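For comparison, here is a minimal sketch of one way to get each packet into the archive as a real tar member. tarfile writes bytes, which is why a StringIO target fails with the TypeError above, and writing to out_stream directly bypasses the archive entirely. The helper name write_json_members, the prefix parameter and the packet_N.json member names are assumptions made for illustration; they are not part of the original question or answer. The idea is to encode each dict to bytes, describe it with a TarInfo whose size is set, and hand it to addfile().

```python
import io
import json
import tarfile


def write_json_members(fileobj, packets, prefix="packet"):
    """Write each dict in `packets` as its own JSON member of a gzip-compressed tar.

    `fileobj` must be a binary file object (BytesIO, open(path, 'wb'), ...),
    because tarfile emits bytes. The member naming scheme is hypothetical.
    """
    # 'w|gz' is the stream (pipe) mode used in the question.
    with tarfile.open(fileobj=fileobj, mode="w|gz") as tar_file:
        for index, packet in enumerate(packets):
            payload = json.dumps(packet).encode("utf-8")
            info = tarfile.TarInfo(name="{}_{}.json".format(prefix, index))
            info.size = len(payload)  # addfile() reads exactly this many bytes
            tar_file.addfile(info, io.BytesIO(payload))


data = [{"foo": "bar"}, {"cheese": None}]
buffer = io.BytesIO()
write_json_members(buffer, data)

# Read the archive back to check that the packets round-trip.
buffer.seek(0)
with tarfile.open(fileobj=buffer, mode="r:gz") as tar_file:
    restored = [
        json.loads(tar_file.extractfile(member).read().decode("utf-8"))
        for member in tar_file.getmembers()
    ]
```

To write to disk instead of memory, the same helper should work with a file opened in binary mode, for example open(filename, 'wb'). Since packets is only iterated once, a generator such as json_io_format(data) can be passed in place of the list.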