Python Modules, Packages & Object Oriented Approach

Python mindmap for modules, packages and object-oriented approach

1. Python random module

1.1. implements pseudo-random number generators

1.1.1. algorithms aren't random - they are deterministic and predictable

1.1.2. A random number generator takes a value called a seed, treats it as an input value, calculates a "random" number based on it (the method depends on a chosen algorithm) and produces a new seed value

1.1.2.1. The initial seed value determines the order in which the generated values will appear

1.1.2.2. if you set the seed value to a fixed value and make a sequence of calls to the random number generator, the "random" numbers produced by that post-seed sequence are always reproducible if you repeat with the same seed

1.1.2.3. using a number derived from the current system date and time is a commonly used source for a seed number because it always produces a different set of random numbers due to never repeating the seed

1.2. seed()

1.2.1. sets seed value

1.2.2. argument is optional, but if supplied it takes an integer or converts to an integer

1.2.2.1. note: even seed('hello') will work, as 'hello' string is converted to an integer

1.2.2.2. without an argument, the seed is taken from a randomness source provided by the operating system if one is available, otherwise from the current system time

1.2.3. you don't have to explicitly set the seed before using one of the random number generator functions; in that case the generator is seeded automatically, as if seed() had been called without an argument, when the module is first imported
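
The reproducibility described above can be checked with a short sketch. The seed value 2024 is an arbitrary choice made for this illustration; any fixed value would behave the same way.

```python
import random

random.seed(2024)                                   # fixed seed: start of a reproducible sequence
first_run = [random.random() for _ in range(3)]

random.seed(2024)                                   # same seed again
second_run = [random.random() for _ in range(3)]

print(first_run == second_run)                      # the two sequences are identical
```

Calling seed() with no argument between the two runs would, in practice, give two different sequences.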

1.3. random()

1.3.1. returns next random float number in the range 0.0 <= x < 1.0 (1.0 itself is never returned)

1.4. randrange(end)

1.4.1. return random integer between 0 and end minus 1

1.4.1.1. e.g. randrange(5) returns random integer between 0 and 4

1.5. randrange(begin,end)

1.5.1. return random integer between begin and end minus 1

1.6. randrange(begin,end, step)

1.6.1. return random integer between begin and end minus 1 in steps of step

1.7. randint(left,right)

1.7.1. return random integer between left and right, with both end points included

1.7.1.1. e.g. randint(1,5) returns random integer between 1 and 5

1.8. choice(sequence)

1.8.1. return random element from a sequence, such as a list of numbers

1.8.1.1. if you use with a for loop and the list .remove method on each iteration, you can use this like a lottery draw

1.9. sample(sequence, elements_to_choose)

1.9.1. return a new list of length elements_to_choose holding distinct elements drawn in random order from a sequence, such as a list; the second argument is required (its real parameter name is k) and the source sequence is left unchanged

1.9.2. elements_to_choose cannot exceed length of sequence, otherwise a ValueError exception is raised
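
A minimal sketch of the functions listed above. The list called numbers and the variable called draw are names chosen for this illustration; the printed values differ from run to run, so the comments only state the possible ranges.

```python
import random

numbers = list(range(1, 11))            # the integers 1..10

print(random.randrange(5))              # one of 0..4
print(random.randrange(2, 10, 2))       # one of 2, 4, 6, 8
print(random.randint(1, 5))             # one of 1..5, both ends included
print(random.choice(numbers))           # one element of the list

draw = random.sample(numbers, 3)        # 3 distinct elements; numbers is not modified
print(draw)

try:
    random.sample(numbers, 11)          # asks for more elements than the list holds
except ValueError as error:
    print("sample failed:", error)
```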

2. Python platform module

2.1. think of your code executing at the top of a pyramid: 1. Code 2. Python runtime environment 3. OS 4. Hardware (device drivers, etc.)

2.1.1. opening a file, for example, is an instruction that goes from your code to the Python runtime environment, which handles the OS instruction, and the OS understands how to interact with the hardware for the required disk reads into memory, etc.

2.2. platform(aliased = False, terse = False)

2.2.1. returns info about the platform that the Python runtime is hosted on

2.2.1.1. e.g. Windows-10

2.3. machine()

2.3.1. returns generic name of processor

2.3.1.1. e.g. AMD64

2.4. processor()

2.4.1. returns real processor name if possible

2.4.1.1. e.g. Intel64 Family 6 Model 78 Stepping 3, GenuineIntel

2.5. system()

2.5.1. returns generic name of OS

2.5.1.1. e.g. Windows

2.6. version()

2.6.1. returns version of OS

2.6.1.1. e.g. 10.0.18362

2.7. python_implementation()

2.7.1. returns Python implementation

2.7.1.1. e.g. CPython

2.8. python_version_tuple()

2.8.1. returns major version, minor version and patch level as 3-element tuple

2.8.1.1. e.g. ('3', '8', '1')
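
The platform functions above can be tried together in one short script. This is only a sketch: the values printed depend entirely on the machine it runs on, so the comments repeat the examples from the notes rather than promise a result.

```python
import platform

print(platform.platform())                  # e.g. Windows-10-...
print(platform.platform(terse=True))        # shorter form of the same information
print(platform.machine())                   # e.g. AMD64
print(platform.processor())                 # may be an empty string on some systems
print(platform.system())                    # e.g. Windows
print(platform.version())
print(platform.python_implementation())     # e.g. CPython

major, minor, patch = platform.python_version_tuple()
print(major, minor, patch)                  # note that the three parts are strings
```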

3. Python Standard Module index

3.1. There are many modules, which collectively make up the Python universe, and pure Python is like a single galaxy within that universe

3.2. The idea is to find specific modules for what you need to do and then learn how to use them

4. Python creating a module

4.1. Observation based on experiment

4.1.1. module.py is empty file representing a module

4.1.2. main.py is file in same directory as module.py and includes a single line: import module (the .py extension is not part of the module name)

4.1.3. when you run main.py for first time, it produces some effects on the file system

4.1.3.1. a __pycache__ subdirectory is created

4.1.3.2. file is created inside __pycache__ subdirectory, named with following convention: <module_name>.<python_implementation>-<xy>.pyc, where x is major version no and y is minor version no

4.1.3.2.1. e.g. module.cpython-36.pyc

4.1.3.3. .pyc file contains compiled bytecode, ready for execution by Python interpreter

4.1.3.3.1. makes module code faster to load next time (the code does not run any faster once it has been loaded)

4.1.3.3.2. Python automatically tracks changes to source module and rebuilds .pyc file when required

4.2. Every module automatically gets a variable named __name__, which Python sets when the module is run or imported

4.2.1. __name__ variable holds one of two values depending on execution context

4.2.1.1. when the module file is run directly as a program, __name__ holds '__main__'

4.2.1.2. when the module has been imported (i.e. you are referencing it as <module_name>.__name__ having previously executed import <module_name>), it holds '<module_name>'

4.2.1.3. this can be used to check execution context and develop appropriate conditional logic based on that

4.2.1.3.1. For example, as modules are generally collections of functions, designed for import and not to be executed as a standalone file, you might add some logic based on __name__ to print some helpful message should someone decide to execute the module file directly
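
The check on __name__ described above is usually written as shown in this sketch. The module name greetings.py and the function greet() are hypothetical, invented only for the illustration.

```python
# Contents of a hypothetical module file, e.g. greetings.py

def greet(name):
    """Return a greeting for name."""
    return "Hello, " + name + "!"


if __name__ == "__main__":
    # Reached only when the file is run directly, not when it is imported
    print("This file is a module; import it rather than running it.")
    print("Quick self-test:", greet("world"))
```

After import greetings in another file, only greet() is defined and nothing is printed, because greetings.__name__ is then 'greetings'.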

5. Python sys module

5.1. path variable

5.1.1. holds list of paths that are searched when running the import statement

5.1.1.1. Python supports reading zip files as directories for modules, which helps save a lot of disk space

5.1.2. appending to or inserting into sys.path list variable is how you can store usable modules in different sub-directories, distinct from program files that use them

6. Python exceptions

6.1. When code is syntactically correct but results in an error, two things happen: 1. Program execution is halted 2. An exception object is created

6.1.1. known as raising an exception

6.1.2. if code does not handle exception, then program will be forcibly terminated

6.1.3. Python interpreter returns name of exception in its error message when not handled

6.1.3.1. example:

6.1.3.1.1. ZeroDivisionError: division by zero

6.1.3.2. part before colon is name of exception

6.1.4. Exception handling for all risky code:

6.1.4.1. general form:

    try:
        <try block>
    except exc1:
        <exception block for exc1>
    except exc2:
        <exception block for exc2>
    except:
        <catch all exception block>

6.2. Rules for try-except

6.2.1. the except branches are searched in the same order in which they appear in the code

6.2.2. you must not use more than one except branch with a certain exception name

6.2.3. the number of different except branches is arbitrary - the only condition is that if you use try, you must put at least one except (named or not) or a finally branch after it

6.2.4. the except keyword must not be used without a preceding try

6.2.5. if any of the except branches is executed, no other branches will be visited

6.2.6. if none of the specified except branches matches the raised exception, the exception remains unhandled

6.2.7. if an unnamed except branch exists (one without an exception name), it has to be specified as the last

6.2.8. it is also possible to extend a standard try-except block with an else branch, which if added MUST follow all except branches and will only be executed if no exception arises from the try block at the top

6.2.8.1. it is also possible to extend a try-except block with a finally branch, which if added MUST be the last branch (i.e. after all except branches and the else branch, if the latter exists), and unlike else, the finally branch will always execute regardless of whether or not an exception arose from the try block

6.2.8.1.1. example
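
A minimal sketch of a try block with except, else and finally branches, following the rules above. The function name reciprocal() is made up for this illustration.

```python
def reciprocal(n):
    try:
        result = 1 / n
    except ZeroDivisionError:
        print("Division failed")
        result = None
    else:
        print("Everything went fine")   # runs only when no exception was raised
    finally:
        print("Leaving reciprocal()")   # runs in both cases
    return result


print(reciprocal(2))    # 0.5
print(reciprocal(0))    # None
```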

6.3. Python 3 defines several dozen built-in exceptions (around 63; the exact number depends on the Python version), which form a hierarchy

6.3.1. Example: ZeroDivisionError

6.3.1.1. is a more specific exception of type ArithmeticError

6.3.1.1.1. ArithmeticError is more specific exception of type Exception

6.3.2. significance of hierarchy is that your try-except block can handle exceptions at any level from most specific to most general

6.3.2.1. example: the following two code fragments are semantically equivalent because ArithmeticError is a general form of the specific ZeroDivisionError that the code is triggering

6.3.2.1.1. handling the specific exception:

    try:
        y = 1 / 0
    except ZeroDivisionError:
        print("Oooppsss...")

    print("THE END.")

6.3.2.1.2. handling the more general exception:

    try:
        y = 1 / 0
    except ArithmeticError:
        print("Oooppsss...")

    print("THE END.")

6.3.2.2. Avoid adding exception handlers for more general exceptions before more specific ones in the same hierarchy - this will make the more specific exception handlers useless as their code is unreachable

6.3.3. you can also include multiple exceptions in single except block

6.3.3.1. exceptions must be comma separated and enclosed in parentheses ( )

6.3.3.1.1. example
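
One except branch can list several exceptions, as in this sketch. The function name safe_divide() is hypothetical and exists only for the illustration.

```python
def safe_divide(text, divisor):
    try:
        return int(text) / divisor
    except (ValueError, ZeroDivisionError) as error:
        # One branch handles both exception types
        print("Cannot compute:", type(error).__name__)
        return None


print(safe_divide("10", 4))     # 2.5
print(safe_divide("ten", 4))    # int("ten") raises ValueError
print(safe_divide("10", 0))     # 10 / 0 raises ZeroDivisionError
```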

6.4. You can also manually trigger exceptions by using the raise keyword with the name of a built-in exception

6.4.1. e.g. raise ZeroDivisionError

6.4.2. it's also possible to use raise without naming an exception, but this is only valid from inside an except block

6.4.2.1. can be useful for distributing exception handling across your code

6.4.2.2. example:

6.4.2.2.1. code:

    def badFun(n):
        try:
            return n / 0
        except:
            print("I did it again!")
            raise

    try:
        badFun(0)
    except ArithmeticError:
        print("I see!")

    print("THE END.")

6.5. You can use the assert statement as a fail-safe check, which will raise an exception of type AssertionError if the expression following the assert keyword evaluates to False or to any value that Python treats as false

6.5.1. example usage:

6.5.1.1. code:

    import math

    x = float(input("Enter a number: "))
    assert x >= 0.0
    x = math.sqrt(x)
    print(x)

6.5.1.1.1. if x >= 0.0 evaluates to False it will raise AssertionError, otherwise it will do nothing

6.5.2. assert will also raise AssertionError when the expression evaluates to any of the following: a number equal to zero, an empty string, None

6.6. Common exceptions in hierarchy:

6.6.1. BaseException

6.6.1.1. Exception

6.6.1.1.1. ArithmeticError

6.6.1.1.2. AssertionError

6.6.1.1.3. LookupError

6.6.1.1.4. MemoryError

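6.6.1.1.5. StandardError (Python 2 only; it no longer exists in Python 3, where the exceptions above derive directly from Exception)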

6.6.1.2. KeyboardInterrupt

6.6.1.2.1. concrete exception raised when the user uses a keyboard shortcut designed to terminate a program's execution (Ctrl-C in most OSs); if handling this exception doesn't lead to program termination, the program continues its execution

6.6.1.3. BaseException itself is the most general (abstract) of all Python exceptions - all other exceptions are included in this one; it can be said that the following two except branches are equivalent: except: and except BaseException:

6.7. Exceptions are classes: when an exception is raised, Python creates an object (i.e. an instance of one of the classes in the class hierarchy that begins with BaseException)

6.7.1. the following simple example demonstrates how we can access information about a captured exception object - note the use of the "as" keyword followed by the alias (e in this example)

6.7.1.1. code:

    try:
        i = int("Hello!")
    except Exception as e:
        print(type(e).__name__)
        print(e.__str__())
        i = None

    print("i =", i)

6.7.1.1.1. returns:

    ValueError
    invalid literal for int() with base 10: 'Hello!'
    i = None

6.8. Python custom exceptions

6.8.1. you can define your own custom exception classes - this can be a specialised extension of a more specific exception class or if you want to create your own very particular hierarchy, you can use the high level Exception class as your top level superclass

6.8.1.1. example:

6.8.1.1.1. code:

    class PizzaError(Exception):
        def __init__(self, pizza = "unknown", message = ""):
            Exception.__init__(self, message)
            self.pizza = pizza

        def __str__(self):
            return "PizzaError"

    class TooMuchCheeseError(PizzaError):
        def __init__(self, pizza = "unknown", cheese = ">100", message = ""):
            PizzaError.__init__(self, pizza, message)
            self.cheese = cheese

        def __str__(self):
            return "TooMuchCheeseError"

7. Python string list sorting

7.1. Python sorted() function

7.1.1. takes a single iterable argument (such as a list) and returns a new list with all elements sorted; the original is left unchanged

7.2. Python list sort() method

7.2.1. sorts and modifies source list (i.e. does not return copy, but actually changes the subject list)
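
The difference between sorted() and the sort() method shows up in a short sketch. The list of Greek letter names is made up for this illustration.

```python
names = ["omega", "alpha", "pi", "gamma"]

ordered = sorted(names)         # new list; names is left as it was
print(ordered)                  # ['alpha', 'gamma', 'omega', 'pi']
print(names)                    # ['omega', 'alpha', 'pi', 'gamma']

result = names.sort()           # sorts in place and returns None
print(names)                    # ['alpha', 'gamma', 'omega', 'pi']
print(result)                   # None
```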

8. Python generator

8.1. a generator is a special type of function that produces a series of values one at a time and returns them encapsulated inside an iterable object

8.2. range() behaves in a similar way, although strictly speaking it is a lazy sequence type rather than a generator

8.2.1. e.g. range(5) produces 5 values, 0 to 4 and returns an object that can be iterated by a for loop

8.3. the same behaviour can also be provided by a class (an iterator) that implements two methods: __iter__(), __next__()

8.3.1. __iter__() method returns the object and is invoked once

8.3.2. __next__() should return the next value (first, second, and so on) of the desired series - it will be invoked by the for/in statements in order to pass through the next iteration; if there are no more values to provide, the method should raise the StopIteration exception

8.3.3. example:

8.3.3.1. code:

    class Fib:
        def __init__(self, nn):
            print("__init__")
            self.__n = nn
            self.__i = 0
            self.__p1 = self.__p2 = 1

        def __iter__(self):
            print("__iter__")
            return self

        def __next__(self):
            print("__next__")
            self.__i += 1
            if self.__i > self.__n:
                raise StopIteration
            if self.__i in [1, 2]:
                return 1
            ret = self.__p1 + self.__p2
            self.__p1, self.__p2 = self.__p2, ret
            return ret

    for i in Fib(10):
        print(i)

8.3.3.1.1. results show that constructor __init__() runs first, then __iter__(), then __next__() is called repeatedly; the final call raises the StopIteration exception, which ends the iteration and is handled silently by the for loop

8.3.3.1.2. note that Fib does not use recursion: __iter__() simply returns the object itself, and the state of the series is kept in the instance variables between calls to __next__()

8.3.3.1.3. the Fib object conforms to the iterator protocol - otherwise the for..in construct would raise an exception

8.4. Iterator protocol

8.4.1. way in which an object should behave to conform to the rules imposed by the context of the for and in statements

8.4.2. An object conforming to the iterator protocol is called an iterator

8.5. Python yield statement

8.5.1. The iterator protocol is rather inconvenient (as the Fib example shows, code is longer and harder to comprehend), which leads to the yield statement, which can be likened to a special form of the return statement

8.5.1.1. example:

8.5.1.1.1. code:

    def fun(n):
        for i in range(n):
            yield i

    for v in fun(5):
        print(v)

8.5.2. using yield instead of return converts a function to a generator, which yields a generator object that is iterable

8.5.3. as well as using a generator in a regular for loop, we can also use it in list comprehension

8.5.3.1. example

8.5.3.1.1. code:

    def powersOf2(n):
        pow = 1
        for i in range(n):
            yield pow
            pow *= 2

    t = [x for x in powersOf2(5)]
    print(t)

8.5.4. list() function can take a generator as its argument and convert it into a regular list

8.5.4.1. example

8.5.4.1.1. code:

    def powersOf2(n):
        pow = 1
        for i in range(n):
            yield pow
            pow *= 2

    t = list(powersOf2(3))
    print(t)

8.5.5. we can use a generator with the in operator in place of a regular list

8.5.5.1. example

8.5.5.1.1. code:

    def powersOf2(n):
        pow = 1
        for i in range(n):
            yield pow
            pow *= 2

    for i in range(20):
        if i in powersOf2(4):
            print(i)

8.6. Python list comprehension to generator

8.6.1. in addition to a list comprehension consuming a generator or another iterable (such as range() ), you can tweak any list comprehension expression so that it yields a generator rather than a list

8.6.1.1. the only change you have to make is to replace the square brackets [ ] of the list comprehension with regular parentheses ( )

8.6.1.1.1. example
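
A minimal sketch of the bracket swap described above. The variable names squares_list and squares_gen are chosen only for this illustration.

```python
squares_list = [x * x for x in range(5)]    # list comprehension: builds the whole list at once
squares_gen = (x * x for x in range(5))     # generator expression: values produced on demand

print(squares_list)                         # [0, 1, 4, 9, 16]
print(type(squares_gen).__name__)           # generator

for value in squares_gen:                   # a generator can be iterated only once
    print(value, end=" ")
print()

print(list(squares_gen))                    # [] because the generator is already exhausted
```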

9. Python closures

9.1. Closures provide an alternative for classes that would typically only be created with one method.

9.2. They avoid the use of global variables and provide a form of data hiding.

9.3. Following criteria must be met to create closure in Python:

9.3.1. 1. Must have a nested function

9.3.2. 2. Nested function must refer to value defined in enclosing function

9.3.3. 3. Enclosing function must return nested function

9.4. example

9.4.1. code:

    def makeclosure(par):
        loc = par

        def power(p):
            return p ** loc

        return power

    fsqr = makeclosure(2)
    fcub = makeclosure(3)

    for i in range(5):
        print(i, fsqr(i), fcub(i))

9.4.1.1. returns

9.4.1.1.1. output:

    0 0 0
    1 1 1
    2 4 8
    3 9 27
    4 16 64

9.4.1.2. Note that power() function references variable loc, defined by makeclosure.

9.4.1.3. Note that makeclosure returns the power() function itself (a function object) by using the return statement with the name of the function WITHOUT parentheses.

9.4.1.4. Note that each call to makeclosure creates a separate power() function object with its own value of loc "locked in" so to speak; we assign these to variables fsqr and fcub respectively and then invoke fsqr and fcub as functions (each with a different fixed value for loc).

10. Python Modules

10.1. Decomposition

10.1.1. breaking down code into smaller self contained parts

10.2. Managing code size and complexity

10.3. Think of a module as a book, folders as shelves and folder collections as libraries, where each book's chapters consist of functions, variables, classes and objects

10.4. Python standard library

10.4.1. Modules and built-in functions that come included with a Python distribution

10.4.2. Includes modules written in C that provide access to file system

10.5. Python import module

10.5.1. use import keyword + name of module

10.5.2. you can import multiple modules in single import statement by comma separating module names

10.5.2.1. e.g. import math, sys

10.5.3. import statement can be anywhere in code but must come before first invocation

10.6. Python namespace

10.6.1. An analogy is a social group where everyone is known by a unique name, perhaps making use of nicknames to ensure unique identification

10.6.2. when you import a module, this is a source file that will have a bunch of associated names that will become known within your code, but by default they won't override any names in your code and must be accessed by prefixing the module name - e.g. math.pi (where math is a module and pi is a constant defined inside that module)

10.6.2.1. import <module>

10.6.2.1.1. all names in that module are accessible but qualifying with module name prefix is mandatory

10.6.2.2. from <module> import <name(s)>

10.6.2.2.1. all names imported are accessible without module name qualification

10.6.2.2.2. <name(s)> can be comma separated list

10.6.2.2.3. e.g. from math import sin, pi

10.6.2.2.4. overrides any pre-existing names, but equally names can be defined in your code after the import and those definitions then override

10.6.2.3. from <module> import *

10.6.2.3.1. Imports all entities from a module

10.6.2.3.2. Higher risk of name conflicts

10.6.2.3.3. Convenient but considered bad practice for regular code

10.6.2.4. import <module> as <alias>

10.6.2.4.1. Imports module and assigns alias, which you use in qualifying references to that module's entities

10.6.2.4.2. note that "as" is a keyword

10.6.2.4.3. after successful aliased import, the original module name cannot be used

10.6.2.5. from <module> import <name> as <alias>

10.6.2.5.1. Imports specific entity (name) with an alias

10.6.2.5.2. <name> as <alias> can repeat in single statement with comma separations

10.7. Python dir() function

10.7.1. Built in function you can use after import <module> to list alphabetically all the entities available from the imported module

10.7.2. example:

10.7.2.1. import math dir(math)

10.7.3. if you import module with an alias, then you must use alias with dir()
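
The import variants and dir() described above can be compared side by side. This sketch uses only the math module; the alias names m and square_root are arbitrary choices made for the illustration. The plain import math is included as well, which is why the name math stays usable here next to the alias.

```python
import math                             # names must be qualified: math.pi
from math import sin, pi                # names usable without a prefix
import math as m                        # module reachable through the alias m
from math import sqrt as square_root    # single entity under an alias

print(math.pi == pi == m.pi)            # True: all three refer to the same constant
print(sin(pi / 2))                      # 1.0
print(square_root(16))                  # 4.0
print("hypot" in dir(m))                # True: dir() lists the names the module provides
```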

11. Python math module

11.1. sin(x)

11.1.1. sine of x

11.2. cos(x)

11.2.1. cosine of x

11.3. tan(x)

11.3.1. tangent of x

11.4. asin(x)

11.4.1. arcsine of x

11.5. acos(x)

11.5.1. arccosine of x

11.6. atan(x)

11.6.1. arctangent of x

11.7. pi

11.7.1. constant that approximates pi value

11.8. radians(x)

11.8.1. converts x from degrees to radians

11.9. degrees(x)

11.9.1. converts x from radians to degrees

11.10. e

11.10.1. constant that approximates Euler's number

11.11. exp(x)

11.11.1. e to the power of x

11.12. log(x)

11.12.1. natural logarithm of x

11.13. log(x, b)

11.13.1. logarithm of x to the base b

11.14. log10(x)

11.14.1. decimal logarithm of x, usually more accurate than log(x, 10)

11.15. log2(x)

11.15.1. binary logarithm of x, usually more accurate than log(x, 2)

11.16. pow(x, y)

11.16.1. x to the power of y

11.16.2. note: math.pow() converts both arguments to floats and returns a float; Python also has a built-in pow() function and the ** operator, which need no import and keep integer results exact

11.17. ceil(x)

11.17.1. ceiling of x (smallest integer greater than or equal to x)

11.18. floor(x)

11.18.1. floor of x (largest integer less than or equal to x)

11.19. trunc(x)

11.19.1. value of x truncated to an integer

11.19.1.1. behaves like floor on positive numbers and ceil on negative numbers

11.20. factorial(x)

11.20.1. value of x!

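11.20.1.1. x must not be negative (0 is allowed), otherwise a ValueError exception is raised

11.20.1.2. x must be a whole number, otherwise an exception is raised (older Python versions also accepted floats with whole values such as 5.0, but current versions require an integer)

11.20.1.2.1. ok: e.g. factorial(5)

11.20.1.2.2. not ok: e.g. factorial(5.5)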

11.21. hypot(x, y)

11.21.1. returns length of hypotenuse for right-angled triangle with leg lengths of x and y
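
A few of the math functions above, sketched with small inputs so that the results are easy to check by hand. The inputs are arbitrary values chosen for this illustration.

```python
import math

print(math.floor(-2.5), math.ceil(-2.5), math.trunc(-2.5))   # -3 -2 -2
print(math.floor(2.5), math.ceil(2.5), math.trunc(2.5))      # 2 3 2

print(math.hypot(3, 4))                 # 5.0
print(math.factorial(5))                # 120
print(math.log2(8), math.log10(1000))   # 3.0 3.0

print(math.pow(2, 10), pow(2, 10))      # 1024.0 1024
```

The first two lines show trunc() agreeing with ceil() for the negative input and with floor() for the positive one.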

12. Python packages

12.1. Group together modules

13. Python module features

13.1. When you add variables in a module, there is no way in Python to keep that variable hidden or protected from unwanted changes by the module user.

13.1.1. Python module developers must trust their users not to mis-use the module variable.

13.1.2. There is a common convention to prefix "internal" variable names with an underscore "_" or double underscore "__". This is intended to communicate to the module user that it's supposed to be an internal read-only variable.

13.2. The shebang (also spelled shabang, or called hashbang) line that is often added to the top of a module file begins with "#!". It is just a comment to Python, but on Unix, Linux and macOS it instructs the OS how to execute the contents of the file.

13.2.1. Example is "#!/usr/bin/env python3", which would be common to see in python modules residing on Linux for example.

13.3. It is common practice to include a string literal enclosed by triple quotes """ either side, which may well span multiple lines.

13.3.1. This explains the purpose of the module and is known as the doc-string.

13.3.1.1. Typically this will immediately follow the hashbang comment at the top of the module file.

14. Python packages

14.1. Packages are a collection of related modules organised into a hierarchical sub-directory collection in the host file system

14.2. A reference to a function in a module that is nested below the top level package directory is made using dot (.) notation to separate the sub-directory references

14.2.1. example: extra.good.best.tau.funT()

14.2.1.1. extra, good and best represent hierarchical sub-directories in the package

14.2.1.2. extra is the top level directory for the package

14.2.1.3. tau is a module (filename tau.py) located inside the best sub-directory

14.2.1.4. funT() is a function located inside the tau.py module

14.3. In order for Python to recognise that a particular collection of module files represents a package, initialisation is required

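14.3.1. Package initialisation is achieved by placing a file with the following name in the top level directory for the package, and in each sub-directory that is to be treated as a sub-package:

14.3.1.1. __init__.py

14.3.1.1.1. if you don't require any special initialisation for the package, this file can be empty; for a regular package the file itself must exist (since Python 3.3 a directory without it can still be imported as a namespace package)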

14.4. A common file structure for storing program files and packages is two parallel sub-directories: packages and programs

14.4.1. here is a common piece of code to have at the top of your program files given the aforementioned structure of parallel sub-directories named programs and packages

14.4.1.1. code:

    from sys import path
    path.append('..\\packages')

14.4.1.1.1. the double dot (..) steps back up 1 level in directory hierarchy (from programs), and the backslash is doubled because Python recognises \ as an escape character, so we must escape it for Python to treat as a sub-directory reference
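
The snippet above uses a Windows-style separator and a path relative to the current working directory. A more portable variant might look like the following sketch, which assumes the same layout (a programs directory next to a packages directory) and builds the path from the location of the program file itself using the standard pathlib module.

```python
import sys
from pathlib import Path

# Assumed layout: <root>/programs/<this file> and <root>/packages/<package directories>
try:
    here = Path(__file__).resolve().parent      # directory holding this program file
except NameError:
    here = Path.cwd()                           # fallback, e.g. in an interactive session

packages_dir = here.parent / "packages"         # one level up, then into packages
if str(packages_dir) not in sys.path:
    sys.path.append(str(packages_dir))

print(packages_dir)
```

Because pathlib picks the right separator for the host OS, no escaping of backslashes is needed.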

15. Text Handling

15.1. Computers store characters as numbers

15.2. Common need to process character data across varied computer systems led to standardisation of character encoding systems

15.2.1. ASCII is one of the most popular, common standards, based on the Latin alphabet; it is a 7-bit code that defines 128 characters

15.2.1.1. Latin alphabet + numeric digits and various common whitespace (e.g. TAB, SPACE) and control (e.g. CR, LF) characters, plus a few commonly used symbols (e.g. $, !) are encoded within the 128 code points of ASCII (0 to 127)

15.2.1.2. 8-bit extensions of ASCII use the concept of a code page for the upper 128 characters (128 to 255) to support needs for some other languages with similar alphabets

15.2.1.2.1. this means that a single code point in the 128 to 255 range can represent a different character depending on the code page that is being applied

15.2.2. ASCII is inadequate to support need for internationalization, which is a term that may be referred to as I18N (starts with "I" + 18 letters, ends with "N")

15.2.2.1. Code pages solved the I18N problem for a while but were recognised as imperfect, which led to the Unicode standard

15.2.2.1.1. Unicode assigns a unique (unambiguous) code point to each character (letters, hyphens, ideograms, etc.) and has room for more than a million code points

15.2.2.1.2. First 128 characters of Unicode are identical to ASCII

15.2.2.1.3. first 256 Unicode code points are identical to the ISO/IEC 8859-1 code page (a code page designed for western European languages)

15.2.2.1.4. Unicode standard says nothing about how to code and store the characters in the memory and files. It only names all available characters and assigns them to planes (a group of characters of similar origin, application, or nature).

15.2.3. Each character (alphabetic, numeric, symbolic, whitespace or control) is represented in an encoding system by a code point, each of which has a unique number assigned to it by the encoding system

15.3. Strings in Python are immutable sequences

15.3.1. you can iterate them like lists and you can access individual characters via index references

15.3.1.1. examples

15.3.1.1.1. iterating by index:

    myString = 'hello world'
    for i in range(len(myString)):
        print(myString[i], sep="", end="")

15.3.1.1.2. iterating directly over the characters:

    myString = 'hello world'
    for c in myString:
        print(c, sep="", end="")

15.3.2. slices work with strings too

15.3.2.1. code:

    alpha = "abdefg"
    print(alpha[1:3])
    print(alpha[3:-2])
    print(alpha[::2])

15.3.2.1.1. returns:

    bd
    e
    adf

15.3.3. you can use the in and not in operators with strings too, like lists

15.3.4. unlike lists, you cannot use del with an index reference to remove any part of a string, although you can use del on the whole string

15.3.4.1. it follows that unlike lists, strings do not have an append() or insert() method, so any attempt to use those with a variable holding a string will raise an exception

15.3.5. min() function works with strings and lists alike

15.3.5.1. example:

15.3.5.1.1. code:

    print(min("aAbByYzZ"))

    t = 'The Knights Who Say "Ni!"'
    print('[' + min(t) + ']')

    t = [0, 1, 2]
    print(min(t))

15.3.5.2. string argument cannot be an empty string, otherwise it will throw a ValueError exception

15.3.6. max() function works in opposite way to min()

15.3.7. index() method returns first index position of a substring passed as argument, but if substring not found it will raise ValueError exception

15.3.7.1. example

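15.3.7.1.1. code (the last call raises ValueError because the search is case-sensitive and 'World' does not occur in the string):

    myString = 'Hello world'
    print(myString.index('w'))
    print(myString.index('Hell'))
    print(myString.index('World'))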

15.3.8. list() function will convert a string to a list

15.3.8.1. example:

15.3.8.1.1. print(list("abcabc"))

15.3.9. count() method works the same for strings as for lists

15.3.9.1. example:

15.3.9.1.1. code:

    myString = "abcabc"
    myList = list(myString)
    print(myString.count('a'))
    print(myList.count('a'))

15.4. Multiline strings are specified using either 3 apostrophes ''' or 3 quotes """

15.4.1. examples

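15.4.1.1. with apostrophes:

    multiLine = '''Line #1
    Line #2'''

15.4.1.2. with quotes:

    multiLine = """Line #1
    Line #2"""

15.4.2. note that this is illegal

15.4.2.1. a string delimited by a single apostrophe or quote cannot continue on the next line:

    multiLine = 'Line #1
    Line #2'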

15.4.3. note that len includes the whitespace characters

15.4.3.1. when you press enter in Python editor, you will get LF whitespace character added, which can also be denoted as \n

15.5. String data supports use of + and * operators, which is an example of overloading as they do not behave in same way as for arithmetic operations involving numbers

15.5.1. + performs string concatenation

15.5.1.1. print('hello ' + 'world')

15.5.1.1.1. returns hello world

15.5.2. * performs string repetition (the string is repeated the given number of times)

15.5.2.1. print('a' * 3)

15.5.2.1.1. returns aaa

15.5.3. += and *= are both supported for string assignments

15.6. Python ord() function

15.6.1. takes a one-character string argument and returns the Unicode code point number for the character

15.7. Python chr() function

15.7.1. opposite function of ord(); takes a single integer argument (a code point) and returns the corresponding one-character string
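
A short sketch of ord() and chr() working as a pair. The characters used are arbitrary, and the variable offset is a name introduced only for this illustration.

```python
print(ord("a"))                     # 97
print(ord(" "))                     # 32
print(chr(97))                      # a
print(chr(945))                     # Greek small letter alpha
print(chr(ord("x")) == "x")         # True: chr() reverses ord()

# Convert a lowercase letter to uppercase through its code point
offset = ord("a") - ord("A")
print(chr(ord("q") - offset))       # Q
```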

15.8. Python string specific methods

15.8.1. capitalize() method

15.8.1.1. if the first character of the source string (index 0) is a letter, it is converted to upper case, and all other letters in the string are converted to lower case; the result is returned as a new string

15.8.1.1.1. example:

15.8.2. center() method

15.8.2.1. one-parameter variant of the center() method makes a copy of the original string, trying to center it inside a field of a specified width

15.8.2.1.1. example:

15.8.2.1.2. if string length exceeds argument value, then copy of original string returned without any added spaces

15.8.2.2. two-parameter variant of center() makes use of the character from the second argument, instead of a space

15.8.2.2.1. example:
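
The capitalize() and center() behaviour described above, sketched with made-up strings. The square brackets are added only to make the padding visible.

```python
print("aBcD eF".capitalize())                   # Abcd ef

print("[" + "alpha".center(10) + "]")           # [  alpha   ]
print("[" + "Beta".center(2) + "]")             # [Beta] as the field is too narrow

starred = "gamma".center(20, "*")               # padded with * instead of spaces
print("[" + starred + "]")
```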

15.8.3. endswith() method

15.8.3.1. returns True if source string ends with substring passed as argument, else False

15.8.3.1.1. example:

15.8.3.2. startswith() is the mirror opposite of endswith(), returning True if source string starts with substring passed as argument, else False

15.8.4. find() method

15.8.4.1. similar to index(), it looks for a substring and returns the index of first occurrence of this substring, but it's safer (returns -1 if substring not found rather than raising exception like index()) and works with strings only

15.8.4.1.1. example:

15.8.4.1.2. use 2-parameter variant to start search at some position beyond index 0

15.8.4.1.3. use 3-parameter variant to limit upper position of search

15.8.4.2. rfind() method

15.8.4.2.1. has 1, 2 and 3 parameter variants that work almost identically to find() but starts search from end of string and works back
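
A sketch of endswith(), startswith(), find() and rfind() on one sample string. The string itself is made up so that the substring "eta" occurs three times.

```python
text = "theta eta beta"

print(text.startswith("the"))       # True
print(text.endswith("beta"))        # True

print(text.find("eta"))             # 2
print(text.find("eta", 3))          # 6
print(text.find("eta", 7, 12))      # -1 as no match lies inside positions 7..11
print(text.rfind("eta"))            # 11
print(text.find("zeta"))            # -1 and no exception is raised
```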

15.8.5. isalnum() method

15.8.5.1. returns True if source string consists exclusively of numeric digits and/or alphabetic characters (letters), else False

15.8.5.1.1. example:

15.8.5.1.2. will return False if string contains any spaces

15.8.5.1.3. will return True for alphabets other than Western Latin

15.8.6. isalpha() method

15.8.6.1. more specialised than isalnum(), returning True only when all characters are alphabetic letters

15.8.7. isdigit() method

15.8.7.1. more specialised than isalnum(), returning True only when all characters are numeric digits

15.8.8. islower() method

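15.8.8.1. returns True only when the string contains at least one letter and all of its letters are lowercase (characters without case, such as digits and spaces, are ignored)

15.8.9. isupper() method

15.8.9.1. opposite of islower(), returns True only when the string contains at least one letter and all of its letters are uppercase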

15.8.10. isspace() method

15.8.10.1. returns True when all characters are whitespace

15.8.10.1.1. example:
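
The is...() checks above, sketched with a few short strings chosen for this illustration.

```python
print("lambda30".isalnum())     # True
print("lambda 30".isalnum())    # False because of the space
print("Moooo".isalpha())        # True
print("2018".isdigit())         # True
print("year2019".isdigit())     # False
print("moooo".islower())        # True
print("abc1".islower())         # True as the digit has no case and is ignored
print("MOOOO".isupper())        # True
print(" \n ".isspace())         # True
print("".isspace())             # False for an empty string
```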

15.8.11. join() method

15.8.11.1. takes a single iterable of strings (typically a list) as an argument and uses the source string as a separator to combine all the strings into a single new string

15.8.11.1.1. example:

15.8.11.1.2. if list argument does not hold exclusively string elements, it will raise a TypeError exception
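A minimal sketch of join(), including the TypeError case; the list contents are arbitrary:

```python
words = ["alpha", "beta", "gamma"]
joined = ", ".join(words)          # the source string ", " is the separator

try:
    "-".join(["a", 1, "b"])        # 1 is not a string
    join_error = None
except TypeError as exc:
    join_error = type(exc).__name__

print(joined)
print(join_error)
```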

15.8.12. lower() method

15.8.12.1. converts all uppercase letters in source string to lowercase and returns copy of transformed string

15.8.12.1.1. example:

15.8.12.2. swapcase() method transforms all lowercase letters to upper and all uppercase letters to lower, returning transformed string

15.8.12.3. title() method transforms first letter of every word to uppercase, and all other letters to lowercase

15.8.12.3.1. example:

15.8.12.4. upper() method does the mirror opposite of lower()
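The four case-changing methods compared on one made-up phrase (a sketch only):

```python
phrase = "pyTHON is FUN"        # hypothetical sample string

lowered = phrase.lower()        # "python is fun"
uppered = phrase.upper()        # "PYTHON IS FUN"
swapped = phrase.swapcase()     # "PYthon IS fun"
titled = phrase.title()         # "Python Is Fun"

# The source string itself is never modified.
print(phrase, lowered, uppered, swapped, titled, sep="\n")
```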

15.8.13. lstrip() method

15.8.13.1. with no argument, this removes all leading whitespace characters from source string and returns transformed copy

15.8.13.1.1. example:

15.8.13.2. with a single string argument, the characters in that string (rather than whitespace) are the ones removed from the start of the source string

15.8.13.2.1. example:

15.8.13.3. rstrip() method is same as lstrip(), with 0 and 1 parameter variants, but works from opposite end of string

15.8.13.3.1. actually, the substring argument works by examining rightmost part of source string for ANY combination of substring characters and strips all of them

15.8.13.4. strip() method combines lstrip() and rstrip() into one

15.8.13.4.1. examples:
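A sketch of the three strip methods, with and without an argument; the sample strings are invented:

```python
padded = "   hello   "
left = padded.lstrip()     # "hello   "
right = padded.rstrip()    # "   hello"
both = padded.strip()      # "hello"

url = "www.example.com"    # hypothetical value
# The argument is treated as a set of characters, not as a prefix or suffix.
no_www = url.lstrip("w.")      # strips every leading 'w' and '.'
no_com = url.rstrip(".com")    # strips trailing '.', 'c', 'o', 'm' in any order

print(repr(left), repr(right), repr(both), no_www, no_com)
```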

15.8.14. replace() method

15.8.14.1. requires two substring parameters, searches for every occurrence of the first substring in source string and replaces each with the second substring, and returns result as new string

15.8.14.1.1. example:

15.8.14.2. there is a 3 parameter variant, where the 3rd parameter is an integer limiting the number of replacements

15.8.14.2.1. example:
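A minimal sketch of the 2-parameter and 3-parameter variants of replace(), using an invented sentence:

```python
text = "one fish, two fish, red fish"

all_replaced = text.replace("fish", "cat")      # all occurrences
first_only = text.replace("fish", "cat", 1)     # at most one replacement

print(all_replaced)
print(first_only)
```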

15.8.15. split() method

15.8.15.1. places all substrings found in source string into elements of a list that is returned

15.8.15.1.1. example:

15.8.15.2. assumes that whitespaces are substring delimiters

15.8.15.3. join() method performs opposite action of split(), where the source string would typically be a space or some other delimiter and the argument of join() is a list of strings

15.8.15.3.1. example:
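A sketch showing split() and join() as a round trip. The explicit-delimiter call at the end goes slightly beyond the notes above and is included only for comparison:

```python
sentence = "phi  chi\npsi"          # mixed whitespace between words

parts = sentence.split()            # any run of whitespace is one delimiter
rejoined = " ".join(parts)          # back to a single string

fields = "a,b,c".split(",")         # split() also accepts an explicit delimiter

print(parts, rejoined, fields)
```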

15.9. Python string comparison operators

15.9.1. All the usual comparison operators that can be used with numbers can also be used with strings (== , != , > , >= , < , <=)

15.9.1.1. the main thing to remember when comparing strings is that the comparison is always based on the ordinal ASCII/Unicode value of each character

15.9.1.2. remember that uppercase letters all occupy lower ordinal code values in ASCII and Unicode

15.9.1.2.1. so... print("hello" > "Hello") returns: True

15.9.1.3. bear in mind that a longer string compared to a shorter string is always greater than the shorter one when the longer string holds identical characters at its beginning to the shorter one

15.9.1.3.1. so... print("alpha" < "alphabet") returns: True

15.9.1.4. remember that comparing numbers that are strings is done on same basis as alphabetic letter comparisons

15.9.1.4.1. so... print('10' == '010') returns: False

15.9.1.4.2. and... print('10' < '8') returns: True

15.9.1.5. comparing a string with a number is possible using == and != but it's generally a bad idea to make any comparisons between strings and numbers

15.9.1.5.1. so... print('10' == 10) returns: False

15.9.1.5.2. if you try and use one of the other comparison operators, you'll get a TypeError exception
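The comparison rules above can be checked in one go. This sketch uses ord() only to make the underlying code points visible:

```python
upper_code = ord("H")                   # 72
lower_code = ord("h")                   # 104, so "h" > "H"

case_result = "hello" > "Hello"         # True
prefix_result = "alpha" < "alphabet"    # True
digits_result = "10" < "8"              # True: "1" sorts before "8"
equal_result = "10" == 10               # False, but no exception

try:
    "10" > 10                           # ordering str against int
    order_error = None
except TypeError as exc:
    order_error = type(exc).__name__
```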

15.10. Converting strings to numbers and vice versa

15.10.1. str() function

15.10.1.1. always safe, can convert any numeric type to a string

15.10.2. int() function

15.10.2.1. converts string representation of integer to an integer, but if string does not represent integer, it will raise a ValueError exception

15.10.2.1.1. note: int will not convert a string that represents a float to an int, but will convert an actual float to int by truncating it (the fractional part is discarded, not rounded)

15.10.3. float() function

15.10.3.1. converts string representation of number (float or int) to a float, but if string does not represent a number, it will raise a ValueError exception
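A minimal sketch of the three conversion functions, including the failure cases; all values are arbitrary:

```python
as_text = str(3.5)            # "3.5"
as_int = int("42")            # 42
from_float = int(3.99)        # 3
as_float = float("42")        # 42.0

errors = []
for convert, value in ((int, "3.5"), (int, "abc"), (float, "abc")):
    try:
        convert(value)
    except ValueError:
        errors.append((convert.__name__, value))

print(as_text, as_int, from_float, as_float, errors)
```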

16. Classes and Objects

16.1. Classes categorise by a grouping of characteristics

16.2. A class can have many sub classes and sub classes have super classes

16.2.1. Sub classes inherit from super classes

16.3. Objects are created from classes and automatically belong to a class hierarchy

16.3.1. Conceptually every object has a unique grouping of 3 attribute types: name (think "noun"), properties (think "adjectives") and actions (think "verbs")

16.4. Python classes

16.4.1. create class using class keyword followed by class name, colon and indented class definition (similar to how function is defined)

16.4.1.1. example:

16.4.1.1.1. class TheSimplestClass: pass

16.4.2. Python objects

16.4.2.1. once a class is defined, you can create any number of objects from it by calling the class like a function and assigning the result to a variable

16.4.2.1.1. example:

16.4.2.1.2. object creation is called instantiation (i.e. it becomes an instance of the class)
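A minimal sketch of instantiation, reusing the TheSimplestClass name from above; the variable names are invented:

```python
class TheSimplestClass:
    pass

# Calling the class like a function creates a new, separate object each time.
first_object = TheSimplestClass()
second_object = TheSimplestClass()

same_class = type(first_object) is type(second_object)   # True
same_object = first_object is second_object              # False
```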

16.5. Procedural approach suffers from issues when creating an object (e.g. a stack)

16.5.1. 1. variables referencing built-in types like lists can be accidentally altered by other code in ways not intended

16.5.1.1. solved by the class-object paradigm, which delivers encapsulation (objects cannot have their internal properties altered by external means)

16.5.2. 2. creating multiple versions of an object can often require copying code

16.5.2.1. solved by concept of instantiation (class defines all necessary properties and methods, defined just once, and having many copies at once as objects is easy)

16.5.3. 3. extending the functionality of an object can be fiddly and awkward to manage

16.5.3.1. solved by inheritance and ability to create sub classes

16.6. Python object constructor method

16.6.1. first method of a class should be: def __init__(self):

16.6.1.1. constructor methods require at least one parameter and first one must refer to the object being created - using "self" is a convention (i.e. not compulsory) but it is highly recommended to always follow that convention

16.6.2. constructor methods cannot return a value (returning anything other than None raises a TypeError) because their only job is to initialise the newly created object instance of the class

16.6.3. constructor methods should not be explicitly invoked from the object or from its class (although invocation of the constructor of one of its super classes is allowed, and expected, inside a sub class constructor)

16.7. Python object encapsulation

16.7.1. to make object properties private, you must declare them in the class constructor method with a double underscore (__) prefix

16.7.1.1. examples:

16.7.1.1.1.
    class Stack:
        def __init__(self):
            self.stackList = []
    stackObject = Stack()
    print(len(stackObject.stackList))    # public property: prints 0

16.7.1.1.2.
    class Stack:
        def __init__(self):
            self.__stackList = []
    stackObject = Stack()
    print(len(stackObject.__stackList))  # private property: raises AttributeError

16.8. Python object methods

16.8.1. When defining class methods, you must always define them with at least one parameter and the first parameter should be "self"

16.8.1.1. example:

16.8.1.1.1.
    class Stack:
        def __init__(self):
            self.__stackList = []
        def push(self, val):
            self.__stackList.append(val)
        def pop(self):
            val = self.__stackList[-1]
            del self.__stackList[-1]
            return val

16.8.2. note: a method is a function defined inside a class, but unlike functions there is no such thing as a parameterless method (a method can be invoked without passing any arguments, but it cannot be defined without at least one parameter, the first of which will always be "self")

16.8.2.1. you should never attempt to explicitly pass an argument for self when invoking a method on an object, as Python will do this automatically for you

16.8.3. self is used to get a reference to the (yet to be created) object and gain access to all the object/class variables/methods

16.8.4. object methods can be made hidden (private) just like variables by prefixing with __ and the same property name mangling occurs as for variables

16.9. Python object inheritance

16.9.1. Inheritance is achieved by defining a new class that takes the name of its super class as a parameter name. Furthermore, the constructor method should explicitly invoke the constructor method of its super class

16.9.1.1. example:

16.9.1.1.1.
    class AddingStack(Stack):
        def __init__(self):
            Stack.__init__(self)
            self.__sum = 0

16.9.2. overriding methods via inheritance involves defining the method in the sub class as a mix of invoking the super class method and adding new functionality

16.9.2.1. example:

16.9.2.1.1.
    class AddingStack(Stack):
        def __init__(self):
            Stack.__init__(self)
            self.__sum = 0
        def push(self, val):
            self.__sum += val
            Stack.push(self, val)

16.10. Python object return hidden property value

16.10.1. the way to return a value for a hidden object property is to define a "getter" method for it with a return statement that uses dot notation with a self reference

16.10.1.1. example:

16.10.1.1.1.
    class AddingStack(Stack):
        def __init__(self):
            Stack.__init__(self)
            self.__sum = 0
        def getSum(self):
            return self.__sum

16.11. Python instance variables

16.11.1. the idea of instance variables is that different objects of same class can have different properties that are entirely isolated from each other, and it is even possible to extend an object with new properties, post instantiation

16.11.1.1. example:

16.11.1.1.1.
    class ExampleClass:
        def __init__(self, val = 1):
            self.first = val
        def setSecond(self, val):
            self.second = val
    exampleObject1 = ExampleClass()
    exampleObject2 = ExampleClass(2)
    exampleObject2.setSecond(3)
    exampleObject3 = ExampleClass(4)
    exampleObject3.third = 5
    print(exampleObject1.__dict__)
    print(exampleObject2.__dict__)
    print(exampleObject3.__dict__)

16.11.2. when using private variables (with the double underscore __ prefix), Python creates the instance variable names differently when those variables are created from inside class methods - it adds prefix of "_<class_name>" to the private variable name - but not when variable is created directly from outside

16.11.2.1. example:

16.11.2.1.1.
    class ExampleClass:
        def __init__(self, val = 1):
            self.__first = val
        def setSecond(self, val = 2):
            self.__second = val
    exampleObject1 = ExampleClass()
    exampleObject2 = ExampleClass(2)
    exampleObject2.setSecond(3)
    exampleObject3 = ExampleClass(4)
    exampleObject3.__third = 5
    print(exampleObject1.__dict__)
    print(exampleObject2.__dict__)
    print(exampleObject3.__dict__)

16.11.2.2. this changing of a hidden property from __<property> to _<class>__<property> is known as property name mangling

16.12. Python object __dict__

16.12.1. Python creates a number of built in properties and methods for every new object and __dict__ is a dictionary property that holds names and values of all properties (variables) that the object is currently holding

16.13. Python class variables

16.13.1. Class variables are declared inside class definition, outside of methods, and they can be altered by methods, and key difference with instance variables is that they exist before any objects exist and keep a single value independently of all objects

16.13.1.1. example

16.13.1.1.1.
    class ExampleClass:
        counter = 0
        def __init__(self, val = 1):
            self.__first = val
            ExampleClass.counter += 1
    print(ExampleClass.counter)  # class variable exists before any object does
    exampleObject1 = ExampleClass()
    exampleObject2 = ExampleClass(2)
    exampleObject3 = ExampleClass(4)
    print(exampleObject1.__dict__, exampleObject1.counter)
    print(exampleObject2.__dict__, exampleObject2.counter)
    print(exampleObject3.__dict__, exampleObject3.counter)

16.13.2. Class variables exhibit same behaviour as instance variables when defining them as "private" by using the __ prefix convention

16.13.3. Class variables are members of the class __dict__ property, which can be accessed via class_name.__dict__

16.14. Python hasattr() function

16.14.1. Because Python takes a different approach to OOP than many other languages, objects of the same class can have different properties; the hasattr() function is provided as a safe check for property existence

16.14.1.1. hasattr() requires two parameters: 1. name of class or object (unquoted) 2. name of property (quoted)

16.14.1.1.1. returns True or False

16.14.1.1.2. note: hasattr() will return True when 1st arg is object name and 2nd arg is class variable, but will return False when 1st arg is class name and 2nd arg is instance variable
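A sketch of the hasattr() combinations described in the note above; the class and property names are made up for illustration:

```python
class ExampleClass:
    counter = 0                  # class variable
    def __init__(self):
        self.first = 1           # instance variable

obj = ExampleClass()

obj_has_instance_var = hasattr(obj, "first")             # True
obj_has_class_var = hasattr(obj, "counter")              # True
class_has_class_var = hasattr(ExampleClass, "counter")   # True
class_has_instance_var = hasattr(ExampleClass, "first")  # False
obj_has_missing = hasattr(obj, "missing")                # False
```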

16.15. Python class __name__ property

16.15.1. __name__ is a string property tied to class only, which returns the name of the class

16.15.1.1. example:

16.15.1.1.1.
    class Classy:
        pass
    print(Classy.__name__)

16.15.2. use type() function on object to return class and then return __name__ property from result

16.15.2.1. examples:

16.15.2.1.1.
    class Classy:
        pass
    obj = Classy()
    print(type(obj))

16.15.2.1.2.
    class Classy:
        pass
    obj = Classy()
    print(type(obj).__name__)

16.15.2.1.3. note: print(obj.__name__) will raise an AttributeError as __name__ does not exist in context of object

16.16. Python __module__ property

16.16.1. __module__ is a string property for classes and objects that returns the name of the module that defines the class

16.16.1.1. when the class definition is in the current file (as it would also be if running via interactive interpreter) the result of module is always "__main__"

16.16.1.2. when you use it on an object/class after first importing the module that defines the class, then you will get the proper external module name

16.16.1.3. example:

16.16.1.3.1.
    class Classy:
        pass
    print(Classy.__module__)
    obj = Classy()
    print(obj.__module__)

16.17. Python class __bases__ property

16.17.1. __bases__ is a tuple property built in for all classes, where the elements are superclasses

16.17.1.1. example:

16.17.1.1.1.
    class SuperOne:
        pass
    class SuperTwo:
        pass
    class Sub(SuperOne, SuperTwo):
        pass
    print('( ', end='')
    for x in Sub.__bases__:
        print(x.__name__, end=' ')
    print(')')

16.17.2. where a class has no superclass, it inherits from a built in Python class named object

16.17.2.1. example:

16.17.2.1.1.
    class SuperOne:
        pass
    print('( ', end='')
    for x in SuperOne.__bases__:
        print(x.__name__, end=' ')
    print(')')

16.18. Introspection

16.18.1. ability of a program to examine the type or properties of an object at runtime

16.18.1.1. Python essentially allows you to interrogate all meta data about objects and classes

16.18.2. Python issubclass() function

16.18.2.1. takes two arguments that must both reference classes (passing an object as the 1st argument raises a TypeError) and returns True if the 1st is a subclass of the 2nd

16.18.2.1.1. note: Python considers a class to be a subclass of itself, so that will result in True

16.18.2.1.2. example:

16.18.2.2. Python isinstance() function

16.18.2.2.1. similar to issubclass(), it takes first argument as an object reference and second as a class reference and returns True if object is an instance of the class or any of the class's subclasses
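A minimal sketch of issubclass() and isinstance() on a three-level hierarchy; the class names are invented for illustration:

```python
class Vehicle:
    pass

class LandVehicle(Vehicle):
    pass

class TrackedVehicle(LandVehicle):
    pass

sub_of_super = issubclass(TrackedVehicle, Vehicle)   # True
super_of_sub = issubclass(Vehicle, TrackedVehicle)   # False
sub_of_self = issubclass(Vehicle, Vehicle)           # True

tank = TrackedVehicle()
car = LandVehicle()
tank_is_vehicle = isinstance(tank, Vehicle)          # True: via inheritance
car_is_tracked = isinstance(car, TrackedVehicle)     # False
```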

16.18.3. Python is operator

16.18.3.1. use the is operator to check if one variable holds a reference to the same object as another variable, returning True or False

16.18.3.2. note: this is different to ==, which can return True when comparing two objects of the same type with identical property values, whereas for the is operator to return True, the two variables must refer to a single, common object
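A short sketch contrasting is with ==, using two lists that hold the same values:

```python
list_a = [1, 2, 3]
list_b = [1, 2, 3]     # equal contents, but a separate object
alias = list_a         # second reference to the same object

equal_values = list_a == list_b      # True
same_object = list_a is list_b       # False
alias_same = alias is list_a         # True

alias.append(4)                      # visible through both references
```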

16.18.4. Python super() function

16.18.4.1. The super() function can be used inside class definitions to gain access to the properties (variables and methods) of a superclass without having to explicitly name the superclass

16.18.4.1.1. example:

16.18.4.2. when using super() to access an inherited function, do not specify self as first parameter
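A minimal sketch of super(), assuming two made-up classes named Super and Sub:

```python
class Super:
    def __init__(self, name):
        self.name = name
    def describe(self):
        return "My name is " + self.name

class Sub(Super):
    def __init__(self, name):
        # No superclass name and no explicit self argument needed.
        super().__init__(name)
    def describe(self):
        return super().describe() + " (via Sub)"

description = Sub("Andy").describe()
print(description)
```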

16.19. Reflection

16.19.1. ability of a program to manipulate the values, properties and/or functions of an object at runtime

16.19.1.1. here is a simple program that demonstrates both introspection and reflection in Python - it creates an object, adds instance variables to it (reflection), queries attributes of the object using __dict__ and getattr() (introspection) and changes state using setattr() (reflection)

16.19.1.1.1.
    class MyClass:
        pass
    obj = MyClass()
    obj.a = 1
    obj.b = 2
    obj.i = 3
    obj.ireal = 3.5
    obj.integer = 4
    obj.z = 5
    def incIntsI(obj):
        for name in obj.__dict__.keys():
            if name.startswith('i'):
                val = getattr(obj, name)
                if isinstance(val, int):
                    setattr(obj, name, val + 1)
    print(obj.__dict__)
    incIntsI(obj)
    print(obj.__dict__)

16.20. Python __str__() method

16.20.1. __str__() is a built-in method inherited by everything in Python and its purpose is to describe an object in a string

16.20.1.1. it can be handy to override the __str__() function for any classes you create, enabling a more user friendly description of an object

16.20.1.1.1. example:
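A sketch of overriding __str__(); the Star class and its properties are hypothetical:

```python
class Star:
    def __init__(self, name, galaxy):
        self.name = name
        self.galaxy = galaxy
    def __str__(self):
        # Used by str() and print() instead of the default description.
        return self.name + " in " + self.galaxy

sun = Star("Sun", "Milky Way")
text = str(sun)
print(sun)
```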

16.21. Python multiple inheritance

16.21.1. one type of multiple inheritance for a class (and all objects instantiated from it) is from all its superclasses

16.21.1.1. inheritance works bottom to top, which means if any properties or methods are overridden, the lowest in the super/sub class chain is inherited

16.21.1.1.1. example:

16.21.2. another type of multiple inheritance occurs when a class inherits from two or more unrelated superclasses

16.21.2.1. inheritance works left to right, based on the order of the parameters in the subclass definition

16.21.2.1.1. example:
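Both lookup rules (bottom to top along a chain, left to right across unrelated superclasses) can be sketched as follows; all class names are invented:

```python
class Level1:
    var = 100
    def fun(self):
        return "Level1.fun"

class Level2(Level1):
    def fun(self):                 # overrides Level1.fun
        return "Level2.fun"

class Level3(Level2):
    pass

class Left:
    def who(self):
        return "Left"

class Right:
    def who(self):
        return "Right"

class Sub(Left, Right):            # Left is listed first
    pass

chain_result = Level3().fun()      # nearest definition up the chain wins
chain_var = Level3().var           # still inherited from Level1
order_result = Sub().who()         # leftmost superclass wins
```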

16.21.3. beware using the super() function when dealing with multiple inheritance because it follows the method resolution order (MRO), so it may not be obvious which superclass it will reach

16.21.4. although multiple inheritance is possible it should not be your first choice as it is riskier than single inheritance and violates a principle known as the single responsibility principle

16.21.4.1. consider using composition before going for multiple inheritance

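16.21.4.2. note that Python will reject (with a TypeError) any multiple inheritance whose base class order makes a consistent method resolution order (MRO) impossible, which can happen when the hierarchy effectively forms a diamond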

16.21.4.2.1. example:
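A sketch of a base class order that Python accepts and one that it rejects; the class names Top, Middle, BottomOk and BottomBad are made up for illustration:

```python
class Top:
    pass

class Middle(Top):
    pass

class BottomOk(Middle, Top):       # subclass listed before its superclass
    pass

try:
    class BottomBad(Top, Middle):  # superclass listed before its own subclass
        pass
    mro_error = None
except TypeError as exc:
    mro_error = str(exc)

mro_names = [cls.__name__ for cls in BottomOk.__mro__]
print(mro_names)
print(mro_error)
```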

16.22. Polymorphism

16.22.1. ability of subclasses to inherit from superclasses but change characteristics or behaviour
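A minimal sketch of polymorphism, assuming two hypothetical classes where the subclass changes one method and the inherited code picks up the change:

```python
class One:
    def do_it(self):
        return "do_it from One"
    def do_anything(self):
        # Calls whichever do_it() belongs to the actual object.
        return self.do_it()

class Two(One):
    def do_it(self):               # same name, different behaviour
        return "do_it from Two"

result_one = One().do_anything()
result_two = Two().do_anything()
```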

16.23. Composition

16.23.1. Composition is the process of composing an object using other different objects

16.23.1.1. example:

16.23.1.1.1.
    class Hello:
        firstWord = 'Hello'
        def __init__(self, nextWord):
            self.phrase = Hello.firstWord + ' ' + nextWord
    class World:
        def getWord(self):
            return 'world'
    myPhrase = Hello(World().getWord()).phrase
    print(myPhrase)

17. Python lambda functions

17.1. A lambda function is a function without a name (you can also call it an anonymous function)

17.2. The lambda function is a concept borrowed from mathematics, more specifically, from a part called the Lambda calculus, but these two phenomena are not the same

17.3. Mathematicians use the Lambda calculus in many formal systems connected with logic, recursion, or theorem provability. Programmers use the lambda function to simplify the code, to make it clearer and easier to understand.

17.4. declaration of the lambda function doesn't resemble a normal function declaration in any way

17.4.1. lambda parameters : expression

17.4.1.1. note that as with regular functions, parameters are optional and when there are two or more, these must be separated by commas

17.4.2. very simple example that shows you can actually name lambda functions, although this means they are no longer anonymous and in normal use you will see lambdas used anonymously

17.4.2.1.
    two = lambda: 2
    sqr = lambda x: x * x
    pwr = lambda x, y: x ** y
    for a in range(-2, 3):
        print(sqr(a), end=" ")
        print(pwr(a, two()))

17.5. a lambda function can be substituted anywhere that a regular named function can

17.5.1. following two examples return same result, 1st one shows function as a named one called poly and 2nd example shows that poly can easily be replaced by an anonymous lambda function

17.5.1.1.
    def printfunction(args, fun):
        for x in args:
            print('f(', x, ')=', fun(x), sep='')
    def poly(x):
        return 2 * x**2 - 4 * x + 2
    printfunction([x for x in range(-2, 3)], poly)

17.5.1.2.
    def printfunction(args, fun):
        for x in args:
            print('f(', x, ')=', fun(x), sep='')
    printfunction([x for x in range(-2, 3)], lambda x: 2 * x**2 - 4 * x + 2)

17.6. Python map() function

17.6.1. map() function is a common use case for lambdas (as programmers view it as more elegant coding)

17.6.2. syntax is:

17.6.2.1. map(fun, iter)

17.6.2.2. fun is a function name, which can be a named function, but can also be a lambda function

17.6.2.3. iter is an iterable such as a list, tuple or generator

17.6.2.4. there can be more than one iter argument passed

17.6.3. feeds the elements of iter (iterable) into fun (function) one at a time as arguments and returns a map object that holds results of repeated function calls, where the map object is itself iterable

17.6.3.1. example

17.6.3.1.1.
    firstNames = ["Ian", "Favorita", "Cristina", "Busty", "Amanda"]
    lastNames = ["Bradshaw", "Barbarello", "Barbarello-Bradshaw"]
    # one iterable: the lambda receives one argument per call
    greetings = map(lambda f: "hello " + f, firstNames)
    for m in greetings:
        print(m)
    # two iterables: map() stops as soon as the shorter one is exhausted
    greetings = map(lambda f, l: "hello " + f + " " + l, firstNames, lastNames)
    for m in greetings:
        print(m)

17.6.4. list() function can be used to convert new map object into a list

17.6.4.1. example

17.6.4.1.1.
    firstNames = ["Ian", "Favorita", "Cristina"]
    print(list(map(lambda f: "hello " + f, firstNames)))

17.7. Python filter() function

17.7.1. filter() function is another one, similar to map(), that is often combined with lambda functions for more elegant syntax

17.7.2. syntax is same as map() except that it accepts only 2 arguments (i.e. you cannot specify multiple iterable arguments, like you can with map() )

17.7.3. function (1st argument) is fed values from its 2nd argument and must subject that argument to a True/False test, capturing the element from the 2nd argument whenever the function yields True

17.7.3.1. example

17.7.3.1.1.
    data = [0, 1, 2, 3, 4]
    filtered = list(filter(lambda x: x > 0 and x % 2 == 0, data))
    print(data)
    print(filtered)

18. Python file processing

18.1. One of the most common issues in the developer's job is to process data stored in files

18.2. Different operating systems can treat the files in different ways. For example, Windows uses a different naming convention than the one adopted in Unix/Linux systems.

18.3. canonical file names

18.3.1. name which uniquely defines the location of the file regardless of its level in the directory tree

18.3.2. different between Windows and Linux

18.3.2.1. example

18.3.2.1.1. Windows

18.3.2.1.2. Linux/Unix

18.3.2.2. Windows uses drive letters, Linux does not

18.3.2.3. Root directory is \ in Windows but / in Linux, and sub-directories are denoted same way

18.3.2.4. Linux canonical file names are case sensitive, but not so for Windows

18.3.3. care is needed when specifying canonical file names on a Windows platform because the backslash \ acts as an escape character in Python string expressions

18.3.3.1. one option is to escape the backslash with \\

18.3.3.1.1. name = "C:\\dir\\file"

18.3.3.2. another option is to take advantage of an automated conversion provided by Python, which allows you to express Windows canonical file names using the forward slash

18.3.3.2.1. name = "C:/dir/file"

18.4. Python (like almost every other programming language) does not interact directly with files but does so via abstractions that are commonly referred to as handles or streams

18.4.1. Python provides a rich set of functions and methods that perform operations on streams, which affect the real files using mechanisms contained in the operating system kernel

18.4.2. A stream must be connected to a physical file in a process known as binding

18.4.2.1. when a stream is connected, this is called opening the file

18.4.2.2. when a stream is disconnected, this is called closing the file

18.4.2.3. in between opening and closing a file, the program is free to invoke functions/methods on the stream that will manipulate the file in some way

18.4.2.4. opening a file can fail for multiple reasons and it is important that your program is designed to handle such failures

18.5. File stream opening and closing

18.5.1. must declare open mode

18.5.1.1. read mode

18.5.1.1.1. a stream opened in this mode allows read operations only; trying to write to the stream will cause an exception

18.5.1.2. write mode

18.5.1.2.1. a stream opened in this mode allows write operations only; attempting to read the stream will cause an exception

18.5.1.3. update mode

18.5.1.3.1. a stream opened in this mode allows both writes and reads

18.5.2. attempting an operation not permitted for open mode will cause UnsupportedOperation exception, which inherits OSError and ValueError, and comes from the io module
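A sketch of the exception raised when an operation does not match the open mode. It assumes a throwaway file created with the tempfile module, which is not part of the original notes:

```python
import io
import os
import tempfile

# Hypothetical scratch file, removed again at the end.
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)

stream = open(path, "wt")          # write mode: reads are not permitted
try:
    stream.read()
    outcome = "no error"
except io.UnsupportedOperation as exc:
    outcome = type(exc).__name__
finally:
    stream.close()
    os.remove(path)

inherits_oserror = issubclass(io.UnsupportedOperation, OSError)
inherits_valueerror = issubclass(io.UnsupportedOperation, ValueError)
print(outcome, inherits_oserror, inherits_valueerror)
```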

18.5.3. think of a stream as behaving rather like a tape recorder

18.5.3.1. When you read something from a stream, a virtual head moves over the stream, reading data into memory

18.5.3.2. When you write something to the stream, the same head moves along the stream recording the data from the memory

18.5.3.3. current file position

18.5.3.3.1. a commonly used term, picture this as the current position of the tape recorder read/write head, except it is referring of course to the file stream

18.5.4. file streams are provided via the io module

18.5.4.1. most file streams will inherit from IOBase, which is superclass for the following 3 subclasses

18.5.4.1.1. TextIOBase

18.5.4.1.2. BufferedIOBase

18.5.4.1.3. RawIOBase

18.5.4.2. Python open() function

18.5.4.2.1. built in function that creates file stream object and attempts to connect stream to file (i.e. opening the file)

18.5.4.2.2. note: it is not possible to use constructors for IOBase or any of its subclasses to create file stream, you must use the built in open() function

18.5.4.2.3. has one mandatory parameter for the file name, and 7 other parameters that are all optional as they have default values

18.5.4.2.4. open mode is 2nd parameter

18.5.4.3. Python close() method

18.5.4.3.1. invoke this method on an object created via the open() function to destroy the file stream object, removing its connection to the file

18.5.4.3.2. when close() method invoked on stream object, the buffering (a.k.a. caching) mechanism that handles transfer of data from memory to physical device forces a flush of buffers
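A minimal sketch of the open/close life cycle. The scratch directory comes from the tempfile module and the closed attribute is used only to show the stream state; neither appears in the notes above:

```python
import os
import tempfile

# Hypothetical file location inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

try:
    stream = open(path, "wt", encoding="utf-8")   # mode is the 2nd parameter
    stream.write("buffered text\n")
    open_state = stream.closed                    # False while connected
    stream.close()                                # flushes buffers, disconnects
    closed_state = stream.closed                  # True after close()

    reader = open(path, "rt", encoding="utf-8")
    content = reader.read()
    reader.close()
except OSError as exc:
    print("I/O error occurred:", os.strerror(exc.errno))
```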

18.6. File stream types

18.6.1. text stream

18.6.1.1. text streams are structured in lines; that is, they contain typographical characters (letters, digits, punctuation, etc.) arranged in rows (lines), as seen with the naked eye when you look at the contents of the file in the editor

18.6.1.2. This file is written (or read) mostly character by character, or line by line

18.6.1.2.1. portability consideration

18.6.2. binary stream

18.6.2.1. binary streams don't contain text but a sequence of bytes of any value. This sequence can be, for example, an executable program, an image, an audio or a video clip, a database file, etc

18.6.2.2. Because these files don't contain lines, the reads and writes relate to portions of data of any size. Hence the data is read/written byte by byte, or block by block, where the size of the block usually ranges from one byte to an arbitrarily chosen value.

18.7. Python pre-opened streams

18.7.1. the general rule for streams is that they must be explicitly opened before they can be used, but there are 3 exceptions to this rule

18.7.1.1. the following 3 streams are pre-opened when every Python program starts, and they are defined within the sys module

18.7.1.1.1. sys.stdin

18.7.1.1.2. sys.stdout

18.7.1.1.3. sys.stderr

18.8. Python stream read() method

18.8.1. syntax is read(size)

18.8.1.1. size is an integer giving the maximum amount to read: a number of characters for a text stream, or a number of bytes for a binary stream

18.8.1.2. size is optional, if omitted it defaults to -1 which tells Python to read in the entire file

18.8.1.3. remember that 1 byte in the file is only assured of representing 1 character when the text file holds characters that conform to the ASCII encoding set (in UTF-8, other characters occupy two or more bytes)

18.8.1.4. if you specify size as a number that exceeds the total number of bytes in the file, it just behaves as if you specified -1 (i.e. reads in the whole file)

18.8.2. returns a string representing either the whole file or first x characters from file

18.8.3. example:

18.8.3.1.
    stream = open("./Test-Files/tzop.txt", "rt", encoding = "utf-8")
    print(stream.read())
    stream.close()

18.8.3.1.1. opens file (tzop.txt), which creates a text stream in read mode for file with utf-8 encoding

18.8.3.1.2. reads full content of file and prints to the screen

18.8.4. warning, using read() method without any parameters can be dangerous - the whole file is loaded into memory, so reading a terabyte-long file this way can exhaust memory and crash the program or destabilise the system
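A sketch of read() with and without a size, followed by a fixed-size chunk loop as one way to avoid loading a whole file at once. The scratch file and the chunk size of 4 are arbitrary assumptions:

```python
import os
import tempfile

# Hypothetical scratch file holding ten ASCII characters.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
out = open(path, "wt", encoding="utf-8")
out.write("abcdefghij")
out.close()

stream = open(path, "rt", encoding="utf-8")
first = stream.read(4)        # next 4 characters
rest = stream.read()          # everything up to the end of the file
at_eof = stream.read()        # empty string once the end is reached
stream.close()

# Reading in chunks keeps memory use bounded regardless of file size.
chunks = []
stream = open(path, "rt", encoding="utf-8")
chunk = stream.read(4)
while chunk != "":
    chunks.append(chunk)
    chunk = stream.read(4)
stream.close()
```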

18.9. Python stream readline() method

18.9.1. syntax is readline(size)

18.9.1.1. as with read() method, size limits how much is read (characters for a text stream, bytes for a binary stream), and if that would take the "virtual read head" beyond the EOF (end of file) marker the method simply returns everything up to EOF

18.9.1.2. new line character is also returned

18.9.2. returns a string representing a single line from file

18.9.3. each invocation of readline() returns another line until EOF marker hit, after which it will return an empty string

18.9.4. example:

18.9.4.1.
    from os import strerror
    try:
        ccnt = lcnt = 0
        s = open('./Test-Files/tzop.txt', 'rt')
        line = s.readline()
        while line != '':
            lcnt += 1
            for ch in line:
                print(ch, end='')
                ccnt += 1
            line = s.readline()
        s.close()
        print("\n\nCharacters in file:", ccnt)
        print("Lines in file: ", lcnt)
    except IOError as e:
        print("I/O error occurred:", strerror(e.errno))

18.9.4.1.1. open file "tzop.txt" in read mode using a text stream

18.9.4.1.2. read 1st line in, then enter while loop that repeatedly reads successive lines in until EOF (readline() returns ''), count number of lines, and for each line loop through every character, printing to screen and counting total number of characters

18.10. Python stream readlines() method

18.10.1. syntax is readlines(hintsize)

18.10.1.1. hintsize is optional, if omitted will return every line in file as element in list

18.10.1.2. if hintsize is used, it represents bytes and will be used as a guide as to when to stop reading lines - this will be a rounded up number of bytes and may represent an internal buffer size

18.10.2. returns a list where every element is a string representing a line from the file (including the \n end of line character)

18.10.3. example:

18.10.3.1.
    from os import strerror
    try:
        s = open("./Test-Files/test.txt", "rt")
        lines = s.readlines()
        s.close()
        print(lines)
    except IOError as e:
        print("I/O error occurred:", strerror(e.errno))

18.10.3.1.1. note: test.txt is a 3 line text file and print(lines) returns a list that include 3 string elements, the first two of which conclude with the \n (new line) character

18.11. Python stream objects are iterators

18.11.1. this provides an alternative means to process a file, line by line

18.11.2. the __next__() method of stream object yields the next line from file

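18.11.3. iterating a stream object does not itself close it; a stream opened inline in a for statement is closed only when the unreferenced object is destroyed, so call close() yourself (or open the file in a with statement) when the file must be closed at a definite point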

18.11.4. example:

18.11.4.1.
    from os import strerror
    try:
        for line in open("./Test-Files/test.txt", "rt"):
            print(line, end="")
    except IOError as e:
        print("I/O error occurred:", strerror(e.errno))

18.11.4.1.1. note how compact and elegant this code is when you need to process a text file line by line

18.11.4.1.2. we can take advantage of the stream object being an iterator, which avoids the need to invoke the readlines() method and to write an explicit close() call
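A sketch of line-by-line iteration using next() and a with statement. The with statement is not covered in the notes above and is shown here as an assumption about how one might guarantee the file gets closed; the scratch file is hypothetical:

```python
import os
import tempfile

# Hypothetical three-line scratch file.
path = os.path.join(tempfile.mkdtemp(), "lines.txt")
with open(path, "wt") as out:
    out.write("first\nsecond\nthird\n")

stream = open(path, "rt")
first_line = next(stream)                  # same as stream.__next__()
remaining = [line for line in stream]      # iteration continues from there
stream.close()

with open(path, "rt") as s:                # closed automatically on exit
    line_count = sum(1 for _ in s)
closed_after_with = s.closed
```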

18.12. Python stream write() method

18.12.1. takes a single string argument that will be written to the underlying file via the stream

18.12.2. will raise exception if file stream has been opened in a mode not compatible with write operations

18.12.3. does not automatically add new line characters, so you must add \n at the end of lines if required

18.12.3.1. note that you use \n even in Python programs running on Windows hosts and these will be automatically converted to \r\n in the final file

18.12.3.1.1. do not specify \r\n in your Python code for line endings as, on Windows, this will turn into a combination of CR+CR+LF (only the \n part is converted to \r\n)

18.12.4. example

18.12.4.1.
    from os import strerror
    try:
        ws = open('./Test-Files/test.txt', 'wt')
        ws.write("This is my first file that I created in Python\n")
        ws.write("What do you think?\n")
        ws.write("Pretty cool, huh?")
        ws.close()
    except IOError as e:
        print("I/O error occurred:", strerror(e.errno))

18.12.5. you can also invoke write method on stderr if you want to write directly to that - just remember that you do not have to explicitly open/close that stream

18.12.5.1. example

18.12.5.1.1.
    import sys
    sys.stderr.write("Error message")

18.13. Processing amorphous data

18.13.1. Amorphous data is data which has no specific shape or form - it is just a series of bytes

18.13.1.1. This doesn't mean that these bytes cannot have their own meaning, or cannot represent any useful object, e.g., bitmap graphics

18.13.2. Python requires specialized classes to handle amorphous data

18.13.2.1. bytearray is one such class for handling amorphous data

18.13.2.1.1. bytearray is builtin and available to invoke its constructor method - no need to import any modules to use it

18.13.2.1.2. bytearray() constructor takes 3 optional arguments: source, encoding and errors

18.13.2.1.3. bytearray is similar to a list in that it is mutable, can be iterated and can have its elements read and/or updated

18.13.2.1.4. writing bytearray objects to file streams is done in normal way with stream .write method - you must open the stream in mode compatible for writing to a binary file

18.13.2.1.5. Python stream .readinto() method

18.13.2.1.6. you can also read an entire binary file into memory using the stream .read() method
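A sketch tying the bytearray points together: fill a bytearray, write it to a binary file, then read it back with readinto() and with read(). The scratch file and the byte values are arbitrary assumptions:

```python
import os
import tempfile

data = bytearray(10)                  # ten zero bytes
for i in range(len(data)):
    data[i] = 10 + i                  # elements are integers from 0 to 255

# Hypothetical scratch file inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "file.bin")

bf = open(path, "wb")                 # binary write mode
bf.write(data)
bf.close()

buffer = bytearray(10)                # pre-allocated target for readinto()
bf = open(path, "rb")
bytes_read = bf.readinto(buffer)      # fills buffer, returns the byte count
bf.close()

bf = open(path, "rb")
whole = bytearray(bf.read())          # read() returns bytes; convert if needed
bf.close()

print(bytes_read, list(buffer))
```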