8.0 Files

Syllabus :

8.1 Reading and Writing Files: Files and File Paths, The os.path Module, The File Reading/Writing
Process, Saving Variables with the shelve Module,Saving Variables with the print.format()
Function, Project: Generating Random Quiz Files, Project: Multiclipboard,

8.2 Organizing Files: The shutil Module, Walking a Directory Tree, Compressing Files with the
zipfile Module, Project: Renaming Files with American-Style Dates to European-Style
Dates,Project: Backing Up a Folder into a ZIP File,

8.3 Debugging: Raising Exceptions, Getting the Traceback as a String, Assertions, Logging, IDLE‟s
Debugger.

Reference : Automate the Boring Stuff with Python

8.1 Reading and Writing Files

A file has two key properties: a filename (usually written as one word) and a path. The path specifies the location of a file on the computer.

On Windows, paths are written using backslashes (\) as the separator between folder names. In Python (/ or \\) .The macOS and Linux operating systems, however, use the forward slash (/) as their path separator. If you want your programs to work on all operating systems, you will have to write your Python scripts to handle both cases.

For example, the following code joins names from a list of filenames to the end name:

Similarly, the / operator that we normally use for division can also combine Path objects and strings. This is helpful for modifying a Path Path() function.

Example: The Current working Directory

Example : The Home Directory

All users have a folder for their own files on the computer called the home directory or home folder. You can get a Path object of the home folder by calling Path.home():

Absolute vs. Relative Paths

There are two ways to specify a file path: 1.An absolute path, which always begins with the root folder and 2) A relative path directory . There are also the dot (.) and dot-dot (..) folders. These are not real folders but special names that can be used in a path. A single period (“dot”) for a folder name is shorthand for “this directory”. Two periods (“dot – dot”) means “the parent folder

Creating New Folders Using the os.makedirs() Function

Getting the Parts of a File Path

Given a Path object, you can extract the file different parts as strings using several Path object attributes. These can be useful for constructing new file paths based on existing ones. The parts of a file path include the following:

1. The anchor, which is the root folder of the filesystem 
2. On Windows, the drive, which is the single letter that often denotes a physical hard drive or other storage device 
3. The parent, which is the folder that contains the file 
4. The name of the file, made up of the stem (or base name) and the suffix (or extension)

Example 1 : Getting the parts of a file

The parents attribute (which is different from the parent attribute) evaluates to the ancestor folders of a Path object with an integer index:

Example 2 :

The os.path Module

The os.pathmodule contains many helpful functions related to filenames and file paths. Handling Absolute and Relative Paths :

1. The os.path module provides functions for returning the absolute path of a relative path and for checking whether a given path is an absolute path.

2. Calling os.path.abspath(path) will return a string of the absolute path of the argument. This is an easy way to convert a relative path into an absolute one.

3. Calling os.path.isabs(path) will return True if the argument is an absolute path and False if it is a relative path.

4. Calling os.path.relpath(path, start) will return a string of a relative path from the start path to path. If start is not provided, the current working directory is used as the start path.

Example : Os.Path

Dir Name and Base Name

Example : Dir Name and Base Name Together

Finding File Sizes and Folder Contents

Calling os.path.getsize(path) will return the size in bytes of the file in the path argument. Calling os.listdir(path) will return a list of filename strings for each file in the path argument. (Note that this function is in the os module, not os.path.)

Total size of the Folder

Checking Path Validity

•Assuming that a variable p holds a Path object, you could expect the following:

•Calling p.exists() returns True if the path exists or returns False

•Calling p.is_file() returns True if the path exists and is a file, or returns False otherwise.

•Calling p.is_dir() returns True if the path exists and is a directory, or returns False otherwise.

THE FILE READING/ WRITING PROCESS

1.Plaintext Files

2.Binary Files

Plain text Files

•Plaintext files contain only basic text characters and do not include font, size, or color information.

•Text files with the .txt extension or Python script files with the .py extension are examples of plaintext files.

•These can be opened with Notepad or application.

•Your programs can easily read the contents of plaintext files and treat them as an ordinary string value.

Binary files

•Binary files are all other file types, such as word processing documents, PDFs, images, spreadsheets, and executable programs.

•If you open a binary file in Notepad or TextEdit, it will look like scrambled nonsense, like in Figure.

Read Text and Write Text

  1. The pathlib read_text() method returns a string of the full contents of a text file.
  2. Its write_text() method creates a new text file (or overwrites an existing one)
    with the string passed to it.

Three steps to reading or writing files in Python:

  1. Call the open() function to return a File object.
  2. Call the read() or write() method on the File object.
  3. Close the file by calling the close() method on the File object.

1.Opening a File

Reading the Contents of a File

Using read()

using readline( )

using readlines( )

using read()

helloFile2 = open(‘C:\Users\thyagu\18CS55\Hello1.txt’)

using Read and Readlines

Writing to Files

To write into file open the file in write mode and append mode.
• Write mode (w) will overwrite the existing file and start from scratch
• Append mode (a) will append text to the end of the existing file.
• If the filename passed to open() does not exist, both write and append mode will create a new, blank file.
• After reading or writing a file, call the close() method before opening the file again

Saving Variables with the shelve Module

You can save variables in your Python programs to binary shelf files using the shelve module. This way, your program can restore data to variables from the hard drive. The shelve module will let you add Save and Open features to your program. For example, if you ran a program and entered some configuration settings, you could save those settings to a shelf file and then have the program load them the next time it is run.

Example :

Example :

Saving Variables with the pprint.pformat() Function

Using pprint.pformat() will give you a string that you can write to a .py file. This file will be your very own module that you can import whenever you want to use the variable stored in it.
For example, enter the following into the interactive shell:

Here, we import pprint to let us use pprint.pformat(). We have a list of dictionaries, stored
in a variable cats. To keep the list in cats available even after we close the shell, we use pprint.pformat() to return it as a string. Once we have the data in cats as a string . It is easy to write the string to a File , which we call myCats.py.

The benefit of creating a .py file (as opposed to saving variables with the shelve module) is that because it is a text file, the contents of the file can be read and modified by anyone with a simple text editor. For most applications, however, saving data using the shelve module is the preferred way to save variables to a file. Only basic data types such as integers, floats, strings, lists, and dictionaries can be written to a file as simple text. File objects, for example, cannot be encoded as text.

8.2 Organizing Files

Making copies of all PDF files (and only the PDF files) in every subfolder of a folder. Removing the leading zeros in the filenames for every file in a folder of hundreds of files named spam001.txt, spam002.txt, spam003.txt, and so on Compressing the contents of several folders into one ZIP file (which could be a simple backup system).

THE SHUTIL MODULE

The shutil (or shell utilities) module has functions to let you copy, move, rename, and
delete files in your Python programs. To use the shutil functions, you will first need to use
import shutil.

Copying Files and Folders using shutil

Moving and Renaming Files and Folders

WALKING A DIRECTORY TREE

Say you want to rename every file in some folder and also every file in every subfolder
of that folder. That is, you want to walk through the directory tree, touching each file as
you go. Writing a program to do this could get tricky; fortunately, Python provides a
function to handle this process for you. Here is an example program that uses the os.walk() function on the directory tree

The os.walk() function is passed a single string value: the path of a folder. You can use
os.walk() in a for loop statement to walk a directory tree, much like how you can use the
range() function to walk over a range of numbers. Unlike range(), the os.walk() function will
return three values on each iteration through the loop:
A string of the current folder
A list of strings of the folders in the current folder

A list of strings of the files in the current folder

When you run this program, it will output the following:

——————————————

Since os.walk() returns lists of strings for the subfolder and filename variables, you can
use these lists in their own for loops.

COMPRESSING FILES WITH THE ZIPFILE MODULE

You may be familiar with ZIP files (with the .zip file extension), which can hold the
compressed contents of many other files. Compressing a file reduces its size, which is
useful when transferring it over the internet. And since a ZIP file can also contain
multiple files and subfolders, a handy way to package several files into one. This
single file, called an archive file, can then be, say, attached to an email.
Your Python programs can create and open (or extract) ZIP files using functions in
the zipfile module.

Reading Zip File

To read the contents of a ZIP file, first you must create a ZipFile object (note the capital
letters Z and F). To create a ZipFile object, call the zipfile.ZipFile() function, passing it a string of the .ZIP filename. Note that zipfile is the name of the Python module, and ZipFile() is the name of the function. For example, enter the following into the interactive shell:

A ZipFile object has a namelist() method that returns a list of strings for all the files and
folders contained in the ZIP file. These strings can be passed to the getinfo() ZipFile
method to return a ZipInfo object about that particular file. ZipInfo objects have their own
attributes, such as file_size and compress_size in bytes, which hold integers of the original
file size and compressed file size, respectively. While a ZipFile object represents an entire
archive file, a ZipInfo object holds useful information about a single file in the archive.

Extracting from ZIP Files

The extractall() method for ZipFile objects extracts all the files and folders from a ZIP file
into the current working directory.

Creating and Adding to ZIP Files

To create your own compressed ZIP files, you must open the ZipFile object in write mode
by passing ‘w’ as the second argument.

This code will create a new ZIP file named new.zip that has the compressed contents
of spam.txt.

8.3 Debugging:

The debugger is a feature that executes a program one instruction at a time, giving you a chance to inspect the values in variables while your code runs, and track how the values change over the
course of your program. This is much slower than running the program at full speed, but
it is helpful to see the actual values in a program while it runs, rather than deducing what
the values might be from the source code.

RAISING EXCEPTIONS

Python raises an exception whenever it tries to execute invalid code.

Exceptions are raised with a raise statement. In code, a raise statement consists of the
following:
The raise keyword
A call to the Exception() function
A string with a helpful error message passed to the Exception() function

For example, enter the following into the interactive shell

Example : For example, open a new file editor tab, enter the following code, and save the program as boxPrint.py

Output :

Using the try and except statements, you can handle errors more gracefully instead of
letting the entire program crash.

GETTING THE TRACEBACK AS A STRING

When Python encounters an error, it produces a treasure trove of error information called
the traceback. The traceback includes the error message, the line number of the line that
caused the error, and the sequence of the function calls that led to the error. This
sequence of calls is called the call stack.

Output :

Writing traceback information to log file

Instead of crashing your program right when an exception occurs, you can write the traceback information to a text file and keep your program running.

ASSERTIONS

An assertion is a sanity check to make sure your code isn’t doing something obviously
wrong. These sanity checks are performed by assert statements. If the sanity check fails,
then an AssertionError exception is raised. In code, an assert statement consists of the
following:
1. The assert keyword
2. A condition (that is, an expression that evaluates to True or False)
3. A comma
4. A string to display when the condition is False

Using an Assertion in a Traffic Light Simulation

The data structures of the stoplights at an intersection is a dictionary with keys ‘ns’ and ‘ew’, for the stoplights facing north-south and east-west, respectively. The values at these keys will be one of the strings ‘green’, ‘yellow’, or ‘red’. The code would look something like this:

Logging :

Logging is a way to understand happening in your program and in what order it is happening . Python’s logging module makes it easy to create a record of custom messages.

Using the logging Module : To enable the logging module to display log messages on your screen as your program runs, copy the following to the top of your program :

Example :

Output :

Logging Levels :

Logging levels provide a way to categorize your log messages by importance. There are
five logging levels, described in Table from least to most important. Messages can
be logged at each level using a different logging function.

LevelLogging FunctionDescription
DEBUGlogging.debug()The lowest level. Used for small details. Usually you care about these messages only when diagnosing problems.
INFOlogging.info()Used to record information on general events in your program or confirm that things are working at their point in the program.
WARNINGlogging.warning()Used to indicate a potential problem that doesn’t prevent the program from working but might do so in the future.
ERRORlogging.error()Used to record an error that caused the program to fail to do something.
CRITICALlogging.critical()The highest level. Used to indicate a fatal error that has caused or is about to cause the program to stop running entirely.
Reference : https://automatetheboringstuff.com/chapter10/

The benefit of logging levels is that you can change what priority of logging message you want to see. Passing logging.DEBUG to the basicConfig() function’s level keyword argument will show messages from all the logging levels (DEBUG being the lowest level). But after developing your program some more, you may be interested only in errors. In that case, you can set basicConfig()’s level argument to logging.ERROR. This will show only ERROR and CRITICAL messages and skip the DEBUG, INFO, and WARNING messages.

Disabling Logging

The logging.disable() function disables all log messages so that there is no need to remove remove all the logging calls by hand. To disable logging entirely, just add logging.disable (logging.CRITICAL) to your program. For example, enter the following into the interactive shell:

Logging to a File

Instead of displaying the log messages to the screen, you can write them to a text file.
The logging.basicConfig() function takes a filename keyword argument, like so:

IDLE‟s Debugger.

IDLE has a debugger built into it. It is very useful for stepping through a program and watching the variables change values. Start IDLE and open the program source file as illustrated below:

In the Shell window, click on the Debug menu option at the top and then on Debugger. You will see a “Debug Control” window like this

For the debugger to be most useful, you need to set a breakpoint in your source code before you start running the program. RIGHT click on a line of your source and choose “set breakpoint”.

The background of the line you click on turns yellow to show the line marked with the breakpoint

Now run the program with F5 as usual.

PRACTICE QUESTIONS

  1. What are the key properties of a file ? Explain in detail file reading /writing process with an example of python program
  2. Explain os.path module and the functions of shutil module with examples.
  3. Describe the difference between Python OS and OS.path modules.Also discuss the following methods of os module a. chdir() b. rmdir() c. walk() d. listdir() e. getcwd()
  4. Demonstrate the copy , move ,rename and delete functions of shutil module with Python code snippet
  5. How do we specify and handle absolute , relative paths?
  6. Explain saving of variables using shelve module.
  7. With code snippet , explain reading , extracting and creating ZIP files.
  8. What is the difference between the read() and readlines() methods? In
    C:\bacon\eggs\spam.txt, which part is the dir name, and which part is the base name?
  9. What are the three “mode” arguments that can be passed to the open()function?
  10. With a code snippet explain saving variables using the shelve module and pprint.pformat()
    functions.
  11. ZipFile objects have a close() method just like File objects’ close() method. What
    ZipFile method is equivalent to File objects’ open() method?
  12. What do the os.getcwd() and os.chdir() functions do? What are the . and .. folders?
  13. What is the difference between shutil.copy() and shutil.copytree().
  14. Define assertions. What does an assert statement in python consist of ? Explain how assertions can be used in traffic light simulation with Python code snippet.
  15. Explain in briefly what are the different methods of file operations supports in python shutil module.
  16. . Write a python program to create a folder PYTHON and under the hierarchy 3 files file1,file2 and file3. write the content in file1 as “TECH” and in file2 as “UNIVERSITY” and file3 content should be by opening and merge of file1 and file2. Check out the necessary condition before write file3.
  17. What is a relative path relative to?
  18. What does an absolute path start with?
  19. What does Path(‘C:/Users’) / ‘Al’ evaluate to on Windows?
  20. What does ‘C:/Users’ / ‘Al’ evaluate to on Windows?
  21. What do the os.getcwd() and os.chdir() functions do?
  22. What are the . and .. folders?
  23. In C:\bacon\eggs\spam.txt, which part is the dir name, and which part is the base
    name?
  24. What are the three “mode” arguments that can be passed to open() function?
  25. What happens if an existing file is opened in write mode?
  26. What is the difference between the read() and readlines() methods?
  27. What data structure does a shelf value resemble?
  28. What is the difference between shutil.copy() and shutil.copytree()?
  29. What function is used to rename files?
  30. What is the difference between the delete functions in the send2trash and shutil
    modules?
  31. ZipFile objects have a close() method just like File close() method. What ZipFile
    method is equivalent to File open() method?
  32. Write an assert statement that triggers an AssertionError if the variable spam is an integer less than 10. 2. Write an assert statement that triggers an AssertionError if the variables eggs and bacon contain strings that are the same as each other, even if their cases are different (that is, ‘hello’ and ‘hello’ are considered the same, and ‘goodbye’ and ‘GOODbye’ are also considered the same).
  33. Write an assert statement that always triggers an AssertionError.
  34. What are the two lines that your program must have in order to be able to call logging.debug()?
  35. What are the two lines that your program must have in order to have logging.debug()
    send a logging message to a file named programLog.txt?
  36. What are the five logging levels?
  37. What line of code can you add to disable all logging messages in your program?
  38. Why is using logging messages better than using print() to display the same message?
  39. What are the differences between the Step Over, Step In, and Step Out buttons in the
    debugger?
  40. After you click Continue, when will the debugger stop?
  41. What is a breakpoint?
  42. How do you set a breakpoint on a line of code in IDLE?