1. Python random module
1.1. implements pseudo-random number generators
1.1.1. algorithms aren't random - they are deterministic and predictable
1.1.2. A random number generator takes a value called a seed, treats it as an input value, calculates a "random" number based on it (the method depends on a chosen algorithm) and produces a new seed value
1.1.2.1. The initial seed value determines the order in which the generated values will appear
1.1.2.2. if you set the seed value to a fixed value and make a sequence of calls to the random number generator, the "random" numbers produced by that post-seed sequence are always reproducible if you repeat with the same seed
1.1.2.3. a number derived from the current system date and time is a commonly used seed because it is very unlikely to repeat between runs, so each run normally produces a different sequence of random numbers
1.2. seed()
1.2.1. sets seed value
1.2.2. argument is optional; if supplied it is typically an integer, but a str, bytes or bytearray value is also accepted and is converted to an integer
1.2.2.1. note: even seed('hello') will work, as the 'hello' string is converted to an integer
1.2.2.2. without an argument (or with None), the seed is taken from the operating system's randomness source if one is available, otherwise from the current system time
1.2.3. you don't have to explicitly set the seed before using one of the random number generator functions; in that case the generator is seeded automatically, in the same way as a seed() call with no argument, when the module is imported
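The reproducibility described above can be sketched as follows. This is a minimal illustration only; the seed value 42 and the variable names are arbitrary choices, not part of the notes.

```python
import random

random.seed(42)                                   # fix the seed
first_run = [random.random() for _ in range(3)]

random.seed(42)                                   # same seed again
second_run = [random.random() for _ in range(3)]

print(first_run == second_run)                    # True: same seed, same sequence

random.seed()                                     # no argument: reseed from the OS or the clock
```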
1.3. random()
1.3.1. returns the next random float in the half-open range 0.0 <= x < 1.0 (1.0 itself is never returned)
1.4. randrange(end)
1.4.1. return random integer between 0 and end minus 1
1.4.1.1. e.g. randrange(5) returns random integer between 0 and 4
1.5. randrange(begin,end)
1.5.1. return random integer between begin and end minus 1
1.6. randrange(begin,end, step)
1.6.1. return random integer between begin and end minus 1 in steps of step
1.7. randint(left,right)
1.7.1. return random integer between left and right, both ends included
1.7.1.1. e.g. randint(1,5) returns random integer between 1 and 5 (5 is a possible result, unlike randrange(1,5))
1.8. choice(sequence)
1.8.1. return random element from a sequence, such as a list of numbers
1.8.1.1. if you use with a for loop and the list .remove method on each iteration, you can use this like a lottery draw
1.9. sample(sequence, elements_to_choose)
1.9.1. return a new list of elements_to_choose elements drawn at random, without replacement, from a sequence such as a list; the sequence itself is left unchanged and the count argument is required (the standard library names it k)
1.9.2. elements_to_choose cannot exceed length of sequence, otherwise a ValueError exception is raised
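A short sketch of the functions listed above, used together. The lottery of 49 balls and the fixed seed are assumptions made purely so the example is concrete and repeatable.

```python
import random

random.seed(2024)   # fixed only so that repeated runs match; omit for real draws

die = random.randint(1, 6)           # 1..6, both ends included
even = random.randrange(0, 10, 2)    # one of 0, 2, 4, 6, 8
numbers = list(range(1, 50))         # balls 1..49 (hypothetical lottery)
draw = random.sample(numbers, 6)     # 6 different balls; numbers is unchanged
winner = random.choice(draw)         # a single element picked from the draw

print(die, even, draw, winner)
```

Using sample() avoids the manual loop with choice() and remove() when all you need is a set of distinct picks.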
2. Python platform module
2.1. think of your code executing at the top of a pyramid: 1. Code 2. Python runtime environment 3. OS 4. Hardware (device drivers, etc.)
2.1.1. opening a file, for example, is an instruction that goes from your code to the Python runtime environment, which handles the OS instruction, and the OS understands how to interact with the hardware for the required disk reads into memory, etc.
2.2. platform(aliased = False, terse = False)
2.2.1. returns info about the platform that the Python runtime is hosted on
2.2.1.1. e.g. Windows-10
2.3. machine()
2.3.1. returns generic name of processor
2.3.1.1. e.g. AMD64
2.4. processor()
2.4.1. returns real processor name if possible
2.4.1.1. e.g. Intel64 Family 6 Model 78 Stepping 3, GenuineIntel
2.5. system()
2.5.1. returns generic name of OS
2.5.1.1. e.g. Windows
2.6. version()
2.6.1. returns version of OS
2.6.1.1. e.g. 10.0.18362
2.7. python_implementation()
2.7.1. returns Python implementation
2.7.1.1. e.g. CPython
2.8. python_version_tuple()
2.8.1. returns major version, minor version and patch level as 3-element tuple
2.8.1.1. e.g. ('3', '8', '1')
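The platform functions above can be collected into one small report, as in this sketch. The dictionary keys are illustrative names only, and the values printed depend entirely on the machine running the code.

```python
import platform

report = {
    "platform": platform.platform(),
    "platform (terse)": platform.platform(terse=True),
    "machine": platform.machine(),
    "processor": platform.processor(),
    "system": platform.system(),
    "version": platform.version(),
    "implementation": platform.python_implementation(),
    "python": ".".join(platform.python_version_tuple()),
}

for key, value in report.items():
    print(f"{key:18} {value}")
```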
3. Python Standard Module index
3.1. There are many modules, which collectively make up the Python universe, and pure Python is like a single galaxy within that universe
3.2. The idea is to find specific modules for what you need to do and then learn how to use them
4. Python creating a module
4.1. Observation based on experiment
4.1.1. module.py is empty file representing a module
4.1.2. main.py is a file in the same directory as module.py and contains a single line: import module (the .py extension is not written in the import statement)
4.1.3. when you run main.py for first time, it produces some effects on the file system
4.1.3.1. a __pycache__ subdirectory is created
4.1.3.2. a file is created inside the __pycache__ subdirectory, named with the following convention: <module_name>.<python_implementation>-xy.pyc, where x is the major version number and y is the minor version number
4.1.3.2.1. e.g. module.cpython-36.pyc
4.1.3.3. .pyc file contains compiled bytecode, ready for execution by the Python interpreter
4.1.3.3.1. makes the module faster to load next time, because the source does not have to be compiled again (execution speed itself is unchanged)
4.1.3.3.2. Python automatically tracks changes to source module and rebuilds .pyc file when required
4.2. Every module automatically gets a variable named __name__ when it is run or imported
4.2.1. __name__ holds one of two values depending on execution context
4.2.1.1. when the module file is run directly as the main program, __name__ is '__main__'
4.2.1.2. when the module is imported (i.e. you reference it as <module_name>.__name__ after executing import <module_name>), it is '<module_name>'
4.2.1.3. this can be used to check execution context and develop appropriate conditional logic based on that
4.2.1.3.1. For example, as modules are generally collections of functions, designed for import and not to be executed as a standalone file, you might add some logic based on __name__ to print some helpful message should someone decide to execute the module file directly
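The check on __name__ described above is usually written as shown in this sketch. The module name greetings.py and the function greet() are hypothetical, chosen only for illustration.

```python
"""Hypothetical module greetings.py: a small collection of functions."""


def greet(name):
    """Return a greeting for the given name."""
    return f"Hello, {name}!"


if __name__ == "__main__":
    # Reached only when the file is executed directly, not when it is imported.
    print("This file is a module; import it and call greet() instead.")
    print("Quick self-check:", greet("world"))
```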
5. Python sys module
5.1. path variable
5.1.1. holds list of paths that are searched when running the import statement
5.1.1.1. Python supports reading zip files as directories for modules, which helps save a lot of disk space
5.1.2. appending to or inserting into sys.path list variable is how you can store usable modules in different sub-directories, distinct from program files that use them
6. Python exceptions
6.1. When code is syntactically correct but results in an error, two things happen: 1. Program execution is halted 2. An exception object is created
6.1.1. known as raising an exception
6.1.2. if code does not handle exception, then program will be forcibly terminated
6.1.3. Python interpreter returns name of exception in its error message when not handled
6.1.3.1. example:
6.1.3.1.1. ZeroDivisionError: division by zero
6.1.3.2. part before colon is name of exception
6.1.4. Exception handling for all risky code:
6.1.4.1.
    try:
        <try block>
    except exc1:
        <exception block for exc1>
    except exc2:
        <exception block for exc2>
    except:
        <catch all exception block>
6.2. Rules for try-except
6.2.1. the except branches are searched in the same order in which they appear in the code
6.2.2. you should not use more than one except branch with the same exception name (Python does not reject it, but the later branch can never be reached)
6.2.3. the number of different except branches is arbitrary - the only condition is that if you use try, you must put at least one except (named or not) or a finally branch after it
6.2.4. the except keyword must not be used without a preceding try
6.2.5. if any of the except branches is executed, no other branches will be visited
6.2.6. if none of the specified except branches matches the raised exception, the exception remains unhandled
6.2.7. if an unnamed except branch exists (one without an exception name), it has to be specified as the last
6.2.8. it is also possible to extend a standard try-except block with an else branch, which if added MUST follow all except branches and will only be executed if no exception arises from the try block at the top
6.2.8.1. it is also possible to extend a try-except block with a finally branch, which if added MUST be the last branch (i.e. after all except branches and the else branch, if the latter exists); unlike else, the finally branch always executes, regardless of whether or not an exception arose from the try block
6.2.8.1.1. example
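A minimal sketch of the else and finally branches in action. The function name reciprocal() and the log list are illustrative assumptions, used to record which branches ran.

```python
def reciprocal(n):
    """Return (1/n or None, list of the branches that were executed)."""
    log = []
    try:
        result = 1 / n
    except ZeroDivisionError:
        log.append("except")      # runs only when the try block failed
        result = None
    else:
        log.append("else")        # runs only when the try block succeeded
    finally:
        log.append("finally")     # runs in both cases
    return result, log


print(reciprocal(2))   # (0.5, ['else', 'finally'])
print(reciprocal(0))   # (None, ['except', 'finally'])
```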
6.3. Python 3 defines more than 60 built-in exceptions (the exact number depends on the Python version), which form a hierarchy
6.3.1. Example: ZeroDivisionError
6.3.1.1. is a more specific exception of type ArithmeticError
6.3.1.1.1. ArithmeticError is more specific exception of type Exception
6.3.2. significance of hierarchy is that your try-except block can handle exceptions at any level from most specific to most general
6.3.2.1. example: the following two code fragments behave identically because ArithmeticError is a general form of the specific ZeroDivisionError that the code is triggering
6.3.2.1.1.
    try:
        y = 1 / 0
    except ZeroDivisionError:
        print("Oooppsss...")
    print("THE END.")
6.3.2.1.2.
    try:
        y = 1 / 0
    except ArithmeticError:
        print("Oooppsss...")
    print("THE END.")
6.3.2.2. Avoid adding exception handlers for more general exceptions before more specific ones in the same hierarchy - this will make the more specific exception handlers useless as their code is unreachable
6.3.3. you can also include multiple exceptions in single except block
6.3.3.1. exceptions must be comma separated and enclosed in parentheses ( )
6.3.3.1.1. example
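A sketch of one except branch handling two exception types. The function parse_ratio() is a hypothetical helper invented for this illustration.

```python
def parse_ratio(text):
    """Return 1 / int(text), or the name of the exception that occurred."""
    try:
        return 1 / int(text)
    except (ValueError, ZeroDivisionError) as err:
        # One branch handles both kinds of failure.
        return type(err).__name__


print(parse_ratio("4"))     # 0.25
print(parse_ratio("four"))  # ValueError
print(parse_ratio("0"))     # ZeroDivisionError
```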
6.4. You can also manually trigger exceptions by using the raise keyword with an exception class or instance, such as the name of a built-in exception
6.4.1. e.g. raise ZeroDivisionError
6.4.2. it's also possible to use raise without naming an exception, but this is only valid from inside an except block
6.4.2.1. can be useful for distributing exception handling across your code
6.4.2.2. example:
6.4.2.2.1.
    def badFun(n):
        try:
            return n / 0
        except:
            print("I did it again!")
            raise
    try:
        badFun(0)
    except ArithmeticError:
        print("I see!")
    print("THE END.")
6.5. You can use the assert statement as a fail-safe check: it raises an exception of type AssertionError if the expression following the assert keyword evaluates to a false value
6.5.1. example usage:
6.5.1.1.
    import math
    x = float(input("Enter a number: "))
    assert x >= 0.0
    x = math.sqrt(x)
    print(x)
6.5.1.1.1. if x >= 0.0 evaluates to False it will raise AssertionError, otherwise it will do nothing
6.5.2. assert will also raise AssertionError when the expression evaluates to any of the following: a number equal to zero, an empty string, None
6.6. Common exceptions in hierarchy:
6.6.1. BaseException
6.6.1.1. Exception
6.6.1.1.1. ArithmeticError
6.6.1.1.2. AssertionError
6.6.1.1.3. LookupError
6.6.1.1.4. MemoryError
6.6.1.1.5. StandardError (Python 2 only; it does not exist in Python 3, where these exceptions derive from Exception)
6.6.1.2. KeyboardInterrupt
6.6.1.2.1. concrete exception raised when the user uses a keyboard shortcut designed to terminate a program's execution (Ctrl-C in most OSs); if handling this exception doesn't lead to program termination, the program continues its execution
6.6.1.3. BaseException itself is the most general (abstract) of all Python exceptions - all other exceptions derive from it, so the following two except branches are equivalent: except: and except BaseException:
6.7. Exceptions are classes: when an exception is raised, an object is created (i.e. an instance of one of the classes in the class hierarchy that begins with BaseException)
6.7.1. the following simple example demonstrates how we can access information about a captured exception object - note the use of the "as" keyword followed by the alias (e in this example)
6.7.1.1.
    try:
        i = int("Hello!")
    except Exception as e:
        print(type(e).__name__)
        print(str(e))
        i = None
    print("i =", i)
6.7.1.1.1. returns:
    ValueError
    invalid literal for int() with base 10: 'Hello!'
    i = None
6.8. Python custom exceptions
6.8.1. you can define your own custom exception classes - these can be a specialised extension of a more specific exception class or, if you want to create your own very particular hierarchy, you can use the high level Exception class as your top level superclass
6.8.1.1. example:
6.8.1.1.1.
    class PizzaError(Exception):
        def __init__(self, pizza = "unknown", message = ""):
            Exception.__init__(self, message)
            self.pizza = pizza
        def __str__(self):
            return "PizzaError"
    class TooMuchCheeseError(PizzaError):
        def __init__(self, pizza = "unknown", cheese = ">100", message = ""):
            PizzaError.__init__(self, pizza, message)
            self.cheese = cheese
        def __str__(self):
            return "TooMuchCheeseError"
7. Python string list sorting
7.1. Python sorted() function
7.1.1. takes a list (or any other iterable) as its argument and returns a new list with all elements sorted, leaving the original unchanged
7.2. Python list sort() method
7.2.1. sorts and modifies the source list (i.e. does not return a copy, but changes the subject list in place; the method itself returns None)
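The difference between sorted() and sort() can be seen in this small sketch. The list contents are arbitrary sample data.

```python
names = ["pear", "Apple", "fig"]

ordered = sorted(names)     # new list; names is left untouched
print(ordered)              # ['Apple', 'fig', 'pear'] (uppercase letters sort first)
print(names)                # ['pear', 'Apple', 'fig']

result = names.sort()       # sorts the list in place
print(result)               # None
print(names)                # ['Apple', 'fig', 'pear']
```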
8. Python generator
8.1. a generator is a special type of function that produces multiple outputs and returns these encapsulated inside an iterable object
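8.2. range() behaves in a similar way (strictly speaking it returns a lazy range object rather than a generator, but it is iterated in the same manner)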
8.2.1. e.g. range(5) produces 5 values, 0 to 4 and returns an object that can be iterated by a for loop
8.3. the same behaviour can also be provided by a class (an iterator) that implements two methods: __iter__(), __next__()
8.3.1. __iter__() method returns the iterator object itself and is invoked once, when iteration starts
8.3.2. __next__() should return the next value (first, second, and so on) of the desired series - it will be invoked by the for/in statements in order to pass through the next iteration; if there are no more values to provide, the method should raise the StopIteration exception
8.3.3. example:
8.3.3.1.
    class Fib:
        def __init__(self, nn):
            print("__init__")
            self.__n = nn
            self.__i = 0
            self.__p1 = self.__p2 = 1
        def __iter__(self):
            print("__iter__")
            return self
        def __next__(self):
            print("__next__")
            self.__i += 1
            if self.__i > self.__n:
                raise StopIteration
            if self.__i in [1, 2]:
                return 1
            ret = self.__p1 + self.__p2
            self.__p1, self.__p2 = self.__p2, ret
            return ret
    for i in Fib(10):
        print(i)
8.3.3.1.1. results show that constructor __init__() runs first, then __iter__(), then __next__() is called repeatedly, and the final time is the StopIteration exception that halts the iteration process but is gracefully handled
8.3.3.1.2. note that Fib does not use recursion: the two previous values are kept in the instance attributes __p1 and __p2 between calls to __next__()
8.3.3.1.3. the Fib object conforms to the iterator protocol - otherwise the for..in construct would raise an exception
8.4. Iterator protocol
8.4.1. way in which an object should behave to conform to the rules imposed by the context of the for and in statements
8.4.2. An object conforming to the iterator protocol is called an iterator
8.5. Python yield statement
8.5.1. The iterator protocol is rather inconvenient (as the Fib example shows, the code is longer and harder to comprehend), which leads to the yield statement, which can be likened to a special form of the return statement
8.5.1.1. example:
8.5.1.1.1.
    def fun(n):
        for i in range(n):
            yield i
    for v in fun(5):
        print(v)
8.5.2. using yield instead of return converts a function to a generator, which yields a generator object that is iterable
8.5.3. as well as using a generator in a regular for loop, we can also use it in list comprehension
8.5.3.1. example
8.5.3.1.1.
    def powersOf2(n):
        pow = 1
        for i in range(n):
            yield pow
            pow *= 2
    t = [x for x in powersOf2(5)]
    print(t)
8.5.4. list() function can take a generator as its argument and convert it into a regular list
8.5.4.1. example
8.5.4.1.1.
    def powersOf2(n):
        pow = 1
        for i in range(n):
            yield pow
            pow *= 2
    t = list(powersOf2(3))
    print(t)
8.5.5. we can use a generator with the in operator in place of a regular list
8.5.5.1. example
8.5.5.1.1.
    def powersOf2(n):
        pow = 1
        for i in range(n):
            yield pow
            pow *= 2
    for i in range(20):
        if i in powersOf2(4):
            print(i)
8.6. Python list comprehension to generator
8.6.1. in addition to a list comprehension using a generator (such as range() ), you can tweak any list comprehension expression so that it yields a generator rather than a list
8.6.1.1. the only change you have to make is to replace the square brackets [ ] of the list comprehension with regular parentheses ( )
8.6.1.1.1. example
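A minimal sketch comparing the two forms. The variable names are illustrative; the point is that the parenthesised version produces values lazily and can be consumed only once.

```python
squares_list = [x * x for x in range(5)]   # list comprehension: built at once
squares_gen = (x * x for x in range(5))    # generator expression: built lazily

print(squares_list)             # [0, 1, 4, 9, 16]
print(len(squares_list))        # 5 (a generator object has no len())

total = sum(squares_gen)        # consumes the generator
print(total)                    # 30

leftover = list(squares_gen)    # the generator is already exhausted
print(leftover)                 # []
```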
9. Python closures
9.1. Closures provide an alternative for classes that would typically only be created with one method.
9.2. They avoid the use of global variables and provide a form of data hiding.
9.3. Following criteria must be met to create closure in Python:
9.3.1. 1. Must have a nested function
9.3.2. 2. Nested function must refer to value defined in enclosing function
9.3.3. 3. Enclosing function must return nested function
9.4. example
9.4.1.
    def makeclosure(par):
        loc = par
        def power(p):
            return p ** loc
        return power
    fsqr = makeclosure(2)
    fcub = makeclosure(3)
    for i in range(5):
        print(i, fsqr(i), fcub(i))
9.4.1.1. returns
9.4.1.1.1.
    0 0 0
    1 1 1
    2 4 8
    3 9 27
    4 16 64
9.4.1.2. Note that power() function references variable loc, defined by makeclosure.
9.4.1.3. Note that makeclosure returns the power() function object itself by using the return statement with the name of the function WITHOUT parentheses.
9.4.1.4. Note that each call to makeclosure creates a separate power() function with the value of loc "locked in" so to speak; we assign these to variables fsqr and fcub respectively and then invoke fsqr and fcub as functions (separate functions, each with a different fixed value for loc).
10. Python Modules
10.1. Decomposition
10.1.1. breaking down code into smaller self contained parts
10.2. Managing code size and complexity
10.3. Think of a module as a book, folders as shelves and folder collections as libraries, where each book's chapters consist of functions, variables, classes and objects
10.4. Python standard library
10.4.1. Modules and built-in functions that come included with a Python distribution
10.4.2. Includes modules written in C that provide access to file system
10.5. Python import module
10.5.1. use import keyword + name of module
10.5.2. you can import multiple modules in single import statement by comma separating module names
10.5.2.1. e.g. import math, sys
10.5.3. import statement can be anywhere in code but must be executed before the first use of anything from the module
10.6. Python namespace
10.6.1. An analogy is a social group where everyone is known by a unique name, perhaps making use of nicknames to ensure unique identification
10.6.2. when you import a module, this is a source file that will have a bunch of associated names that will become known within your code, but by default they won't override any names in your code and must be accessed by prefixing the module name - e.g. math.pi (where math is a module and pi is a constant defined inside that module)
10.6.2.1. import <module>
10.6.2.1.1. all names in that module are accessible but qualifying with module name prefix is mandatory
10.6.2.2. from <module> import <name(s)>
10.6.2.2.1. all names imported are accessible without module name qualification
10.6.2.2.2. <name(s)> can be comma separated list
10.6.2.2.3. e.g. from math import sin, pi
10.6.2.2.4. overrides any pre-existing names, but equally names can be defined in your code after the import and those definitions then override
10.6.2.3. from <module> import *
10.6.2.3.1. Imports all entities from a module
10.6.2.3.2. Higher risk of name conflicts
10.6.2.3.3. Convenient but considered bad practice for regular code
10.6.2.4. import <module> as <alias>
10.6.2.4.1. Imports module and assigns alias, which you use in qualifying references to that module's entities
10.6.2.4.2. note that "as" is a keyword
10.6.2.4.3. after successful aliased import, the original module name cannot be used
10.6.2.5. from <module> import <name> as <alias>
10.6.2.5.1. Imports specific entity (name) with an alias
10.6.2.5.2. <name> as <alias> can repeat in single statement with comma separations
10.7. Python dir() function
10.7.1. Built in function you can use after import <module> to list alphabetically all the entities available from the imported module
10.7.2. example:
10.7.2.1.
    import math
    dir(math)
10.7.3. if you import module with an alias, then you must use alias with dir()
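The import forms described in this section can be put side by side, as in this sketch. The aliases m and root are arbitrary names chosen for illustration.

```python
import math                       # names must be qualified: math.pi
import math as m                  # alias: m.pi
from math import sin, pi          # unqualified access: sin, pi
from math import sqrt as root     # single entity imported under an alias

print(math.pi == m.pi == pi)      # True: all three refer to the same constant
print(sin(0))                     # 0.0
print(root(16))                   # 4.0

# dir() works on the alias as well; filter out the "private" names.
public_names = [n for n in dir(m) if not n.startswith("_")]
print(public_names[:5])
```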
11. Python math module
11.1. sin(x)
11.1.1. sine of x
11.2. cos(x)
11.2.1. cosine of x
11.3. tan(x)
11.3.1. tangent of x
11.4. asin(x)
11.4.1. arcsine of x
11.5. acos(x)
11.5.1. arccosine of x
11.6. atan(x)
11.6.1. arctangent of x
11.7. pi
11.7.1. constant that approximates pi value
11.8. radians(x)
11.8.1. converts x from degrees to radians
11.9. degrees(x)
11.9.1. converts x from radians to degrees
11.10. e
11.10.1. constant that approximates Euler's number
11.11. exp(x)
11.11.1. e to the power of x
11.12. log(x)
11.12.1. natural logarithm of x
11.13. log(x, b)
11.13.1. logarithm of x to base b
11.14. log10(x)
11.14.1. decimal logarithm of x, usually more accurate than log(x, 10)
11.15. log2(x)
11.15.1. binary logarithm of x, usually more accurate than log(x, 2)
11.16. pow(x, y)
11.16.1. x to the power of y
11.16.2. note: math.pow() converts its arguments to float and returns a float; there is also a built-in pow() function (and the ** operator), which needs no import and keeps integer results as integers
11.17. ceil(x)
11.17.1. ceiling of x (smallest integer greater than or equal to x)
11.18. floor(x)
11.18.1. floor of x (largest integer less than or equal to x)
11.19. trunc(x)
11.19.1. value of x truncated to an integer
11.19.1.1. behaves like floor on positive numbers and ceil on negative numbers
11.20. factorial(x)
11.20.1. value of x!
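11.20.1.1. x must not be negative (0! is 1), otherwise a ValueError exception is raised
11.20.1.2. x must be an integer, otherwise an exception is raised; older Python versions also accepted floats holding a whole number, such as 4.0, but this was deprecated in Python 3.9
11.20.1.2.1. ok: factorial(4)
11.20.1.2.2. not ok: factorial(4.5)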
11.21. hypot(x, y)
11.21.1. returns length of hypotenuse for right-angled triangle with leg lengths of x and y
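A brief sketch exercising several of the math functions listed above. The input values are arbitrary, and only results that are exact are stated in the comments.

```python
import math

print(math.degrees(math.pi / 2))          # approximately 90 degrees
print(math.log(8, 2), math.log2(8))       # log2(8) is exactly 3.0
print(math.log10(1000))                   # 3.0
print(math.ceil(-2.7), math.floor(-2.7), math.trunc(-2.7))   # -2 -3 -2
print(math.ceil(2.7), math.floor(2.7), math.trunc(2.7))      # 3 2 2
print(math.factorial(5))                  # 120
print(math.hypot(3, 4))                   # 5.0
print(pow(2, 10), math.pow(2, 10))        # 1024 1024.0

try:
    math.factorial(-1)                    # negative argument is rejected
except ValueError as err:
    print("factorial(-1):", err)
```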
12. Python packages
12.1. Group together modules
13. Python module features
13.1. When you add variables in a module, there is no way in Python to keep that variable hidden or protected from unwanted changes by the module user.
13.1.1. Python module developers must trust their users not to mis-use the module variable.
13.1.2. There is a common convention to prefix "internal" variable names with an underscore "_" or double underscore "__". This is intended to communicate to the module user that it's supposed to be an internal read-only variable.
13.2. The shebang (or hashbang) line that is often added at the top of a module file begins with "#!". It is just a comment to Python, but on Unix, Linux and macOS it tells the OS how to execute the contents of the file.
13.2.1. Example is "#!/usr/bin/env python3", which would be common to see in python modules residing on Linux for example.
13.3. It is common practice to include a string literal enclosed by triple quotes """ either side, which may well span multiple lines.
13.3.1. This explains the purpose of the module and is known as the docstring.
13.3.1.1. Typically this will immediately follow the hashbang comment at the top of the module file.
14. Python packages
14.1. Packages are a collection of related modules organised into a hierarchical sub-directory collection in the host file system
14.2. A reference to a function in a module that is nested below the top level package directory is made using dot (.) notation to separate the sub-directory references
14.2.1. example: extra.good.best.tau.funT()
14.2.1.1. extra, good and best represent hierarchical sub-directories in the package
14.2.1.2. extra is the top level directory for the package
14.2.1.3. tau is a module (filename tau.py) located inside the best sub-directory
14.2.1.4. funT() is a function located inside the tau.py module
14.3. In order for Python to recognise that a particular collection of module files represents a package, initialisation is required
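14.3.1. Package initialisation is achieved by placing a file with the following name in the top level directory for the package (and in each sub-directory that is itself part of the package):
14.3.1.1. __init__.py
14.3.1.1.1. if you don't require any special initialisation for the package, this file can be empty; since Python 3.3 a directory without the file can still be imported as a namespace package, but including it remains the usual practice for regular packages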
14.4. A common file structure for storing program files and packages is two parallel sub-directories: packages and programs
14.4.1. here is a common piece of code to have at the top of your program files given the aforementioned structure of parallel sub-directories named programs and packages
14.4.1.1.
    from sys import path
    path.append('..\\packages')
14.4.1.1.1. the double dot (..) steps back up 1 level in directory hierarchy (from programs), and the backslash is doubled because Python recognises \ as an escape character, so we must escape it for Python to treat as a sub-directory reference
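As a possible alternative to the hard-coded backslash path, the same idea can be sketched with pathlib so that it does not depend on the Windows path separator. This assumes the parallel programs and packages layout described above and that the program is started from inside the programs directory; both are assumptions of this sketch.

```python
import sys
from pathlib import Path

# Assumed (hypothetical) layout: <project>/programs and <project>/packages,
# with the current working directory being <project>/programs.
packages_dir = Path.cwd().parent / "packages"

if str(packages_dir) not in sys.path:
    sys.path.append(str(packages_dir))   # searched after the existing entries

print(packages_dir)
```

Like the '..' form, this path is relative to the current working directory rather than to the location of the program file.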
15. Text Handling
15.1. Computers store characters as numbers
15.2. Common need to process character data across varied computer systems led to standardisation of character encoding systems
15.2.1. ASCII is one of the most popular, common standards, based on the Latin alphabet; it is a 7-bit code defining 128 characters, and 8-bit extensions of it allow for 256
15.2.1.1. Latin alphabet + numeric digits and various common whitespace (e.g. TAB, SPACE) and control (e.g. CR, LF) characters, plus a few commonly used symbols (e.g. $, !) are encoded within the 128 code points of ASCII (0 to 127)
15.2.1.2. 8-bit extensions of ASCII leverage the concept of a code page for the upper 128 characters (128 to 255) to support the needs of some other languages with similar alphabets
15.2.1.2.1. this means that a single code point in the 128 to 255 range can represent a different character depending on the code page that is being applied
15.2.2. ASCII is inadequate to support need for internationalization, which is a term that may be referred to as I18N (starts with "I" + 18 letters, ends with "N")
15.2.2.1. Code pages solved the I18N problem for a while but were recognised as imperfect, which led to the Unicode standard
15.2.2.1.1. Unicode assigns unique (unambiguous) characters (letters, hyphens, ideograms, etc.) to code points, and has room for more than a million code points
15.2.2.1.2. First 128 characters of Unicode are identical to ASCII
15.2.2.1.3. first 256 Unicode code points are identical to the ISO/IEC 8859-1 code page (a code page designed for western European languages)
15.2.2.1.4. Unicode standard says nothing about how to code and store the characters in the memory and files. It only names all available characters and assigns them to planes (a group of characters of similar origin, application, or nature).
15.2.3. Each character (alphabetic, numeric, symbolic, whitespace or control) is represented in an encoding system by a code point, each of which has a unique number assigned to it by the encoding system
15.3. Strings in Python are immutable sequences
15.3.1. you can iterate them like lists and you can access individual characters via index references
15.3.1.1. examples
15.3.1.1.1.
    myString = 'hello world'
    for i in range(len(myString)):
        print(myString[i], sep="", end="")
15.3.1.1.2.
    myString = 'hello world'
    for c in myString:
        print(c, sep="", end="")
15.3.2. slices work with strings too
15.3.2.1.
    alpha = "abdefg"
    print(alpha[1:3])
    print(alpha[3:-2])
    print(alpha[::2])
15.3.2.1.1. returns:
    bd
    e
    adf
15.3.3. you can use the in and not in operators with strings too, like lists
15.3.4. unlike lists, you cannot use del with an index reference to remove any part of a string, although you can use del on the whole string
15.3.4.1. it follows that unlike lists, strings do not have an append() or insert() method, so any attempt to use those with a variable holding a string will raise an AttributeError exception
15.3.5. min() function works with strings and lists alike
15.3.5.1. example:
15.3.5.1.1.
    print(min("aAbByYzZ"))
    t = 'The Knights Who Say "Ni!"'
    print('[' + min(t) + ']')
    t = [0, 1, 2]
    print(min(t))
15.3.5.2. string argument cannot be an empty string, otherwise it will throw a ValueError exception
15.3.6. max() function works in opposite way to min()
15.3.7. index() method returns first index position of a substring passed as argument, but if substring not found it will raise ValueError exception
15.3.7.1. example
15.3.7.1.1.
    myString = 'Hello world'
    print(myString.index('w'))
    print(myString.index('Hell'))
    print(myString.index('World'))
15.3.8. list() function will convert a string to a list
15.3.8.1. example:
15.3.8.1.1. print(list("abcabc"))
15.3.9. count() method works the same for strings as for lists
15.3.9.1. example:
15.3.9.1.1.
    myString = "abcabc"
    myList = list(myString)
    print(myString.count('a'))
    print(myList.count('a'))
15.4. Multiline strings are specified using either 3 apostrophes ''' or 3 quotes """
15.4.1. examples
15.4.1.1.
    multiLine = '''Line #1
    Line #2'''
15.4.1.2.
    multiLine = """Line #1
    Line #2"""
15.4.2. note that this is illegal
15.4.2.1.
    multiLine = 'Line #1
    Line #2'
15.4.3. note that len includes the whitespace characters
15.4.3.1. when you press enter in Python editor, you will get LF whitespace character added, which can also be denoted as \n
15.5. String data supports use of + and * operators, which is an example of overloading as they do not behave in same way as for arithmetic operations involving numbers
15.5.1. + performs string concatenation
15.5.1.1. print('hello ' + 'world')
15.5.1.1.1. returns hello world
15.5.2. * performs string repetition (replication)
15.5.2.1. print('a' * 3)
15.5.2.1.1. returns aaa
15.5.3. += and *= are both supported for string assignments
15.6. Python ord() function
15.6.1. takes a 1-character string argument and returns the Unicode code point number for the character
15.7. Python chr() function
15.7.1. opposite function of ord(), takes a single integer argument (a code point) and returns the corresponding 1-character string
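A short sketch of ord() and chr() as a round trip. The offset of 32 between lowercase and uppercase Latin letters is a property of ASCII and Unicode; the variable name shifted is illustrative.

```python
print(ord("a"))             # 97
print(ord("A"))             # 65
print(chr(97))              # a
print(chr(ord("a") + 1))    # b
print(ord("€"))             # 8364, well outside the ASCII range

# Uppercase Latin letters sit 32 code points below their lowercase forms.
shifted = "".join(chr(ord(c) - 32) for c in "abc")
print(shifted)              # ABC
```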
15.8. Python string specific methods
15.8.1. capitalize() method
15.8.1.1. if character at index[0] of source string is a letter, capitalize it, and convert any other letters in string to lower case, with result being a new string
15.8.1.1.1. example: 'aBcD'.capitalize() returns 'Abcd'
15.8.2. center() method
15.8.2.1. one-parameter variant of the center() method makes a copy of the original string, trying to center it inside a field of a specified width
15.8.2.1.1. example: 'beta'.center(10) returns '   beta   '
15.8.2.1.2. if string length exceeds argument value, then copy of original string returned without any added spaces
15.8.2.2. two-parameter variant of center() makes use of the character from the second argument, instead of a space
15.8.2.2.1. example: 'beta'.center(10, '*') returns '***beta***'
15.8.3. endswith() method
15.8.3.1. returns True if source string ends with substring passed as argument, else False
15.8.3.1.1. example: 'epsilon'.endswith('on') returns True
15.8.3.2. startswith() is the mirror opposite of endswith(), returning True if source string starts with substring passed as argument, else False
15.8.4. find() method
15.8.4.1. similar to index(), it looks for a substring and returns the index of first occurrence of this substring, but it's safer (returns -1 if substring not found rather than raising exception like index()) and works with strings only
15.8.4.1.1. example: 'Eta'.find('ta') returns 1, while 'Eta'.find('mma') returns -1
15.8.4.1.2. use 2-parameter variant to start search at some position beyond index 0
15.8.4.1.3. use 3-parameter variant to limit upper position of search
15.8.4.2. rfind() method
15.8.4.2.1. has 1, 2 and 3 parameter variants that work almost identically to find() but it starts the search from the end of the string and works back, so it returns the index of the last occurrence
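The 1, 2 and 3 parameter variants can be compared in this sketch, which also uses the 2-parameter form to collect every occurrence. The sample string and variable names are arbitrary.

```python
text = "banana"

print(text.find("an"))          # 1   first occurrence
print(text.find("an", 2))       # 3   search starts at index 2
print(text.find("an", 2, 4))    # -1  only text[2:4] ('na') is searched
print(text.rfind("an"))         # 3   last occurrence
print(text.rfind("an", 0, 4))   # 1   only text[0:4] ('bana') is searched

# Collect all positions by restarting the search just past each hit.
positions = []
pos = text.find("an")
while pos != -1:
    positions.append(pos)
    pos = text.find("an", pos + 1)
print(positions)                # [1, 3]
```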
15.8.5. isalnum() method
15.8.5.1. returns True if source string consists exclusively of numeric digits and/or alphabetic characters (letters), else False
15.8.5.1.1. example: 'lambda30'.isalnum() returns True, while 'lambda_30'.isalnum() returns False
15.8.5.1.2. will return False if string contains any spaces
15.8.5.1.3. will return True for alphabets other than Western Latin
15.8.6. isalpha() method
15.8.6.1. more specialised than isalnum(), returning True only when all characters are alphabetic letters
15.8.7. isdigit() method
15.8.7.1. more specialised than isalnum(), returning True only when all characters are numeric digits
15.8.8. islower() method
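15.8.8.1. returns True when the string contains at least one cased character and all of its cased characters are lowercase; digits, spaces and punctuation are ignored
15.8.9. isupper() method
15.8.9.1. opposite of islower(), returns True when the string contains at least one cased character and all of its cased characters are uppercase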
15.8.10. isspace() method
15.8.10.1. returns True when all characters are whitespace
15.8.10.1.1. example:
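The is...() family can be compared on a few sample strings, as in this sketch. The samples and variable names are illustrative; note in particular how islower() treats non-letter characters.

```python
samples = ["abc123", "abc 123", "abc", "2024", "-5", " \t\n", ""]

# Columns: string, isalnum, isalpha, isdigit, islower, isupper, isspace
for s in samples:
    print(repr(s), s.isalnum(), s.isalpha(), s.isdigit(),
          s.islower(), s.isupper(), s.isspace())

lower_with_digits = "abc 123".islower()   # True: only cased characters count
no_cased_chars = "123".islower()          # False: no cased characters at all
empty_is_space = "".isspace()             # False: the string is empty
```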
15.8.11. join() method
15.8.11.1. takes a single list (or other iterable) of strings as an argument and uses the source string as a separator to combine all the list strings into a single new string
15.8.11.1.1. example: ','.join(['omicron', 'pi', 'rho']) returns 'omicron,pi,rho'
15.8.11.1.2. if list argument does not hold exclusively string elements, it will raise a TypeError exception
15.8.12. lower() method
15.8.12.1. converts all uppercase letters in source string to lowercase and returns copy of transformed string
15.8.12.1.1. example: 'SiGmA=60'.lower() returns 'sigma=60'
15.8.12.2. swapcase() method transforms all lowercase letters to upper and all uppercase letters to lower, returning transformed string
15.8.12.3. title() method transforms first letter of every word to uppercase, and all other letters to lowercase
15.8.12.3.1. example: 'i know that i know nothing.'.title() returns 'I Know That I Know Nothing.'
15.8.12.4. upper() method does the mirror opposite of lower()
15.8.13. lstrip() method
15.8.13.1. with no argument, this removes all leading whitespace characters from source string and returns transformed copy
15.8.13.1.1. example:
15.8.13.2. with a single string argument, it removes any leading characters that appear in that string (in any combination) instead of whitespace
15.8.13.2.1. example:
15.8.13.3. rstrip() method is same as lstrip(), with 0 and 1 parameter variants, but works from opposite end of string
15.8.13.3.1. actually, the substring argument works by examining rightmost part of source string for ANY combination of substring characters and strips all of them
15.8.13.4. strip() method combines lstrip() and rstrip() into one
15.8.13.4.1. examples:
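A hedged sketch of the strip family. The sample strings and names are assumptions; note how the argument is treated as a set of characters rather than as a literal substring.

```python
padded = "   tau   "
left = padded.lstrip()    # leading whitespace removed
right = padded.rstrip()   # trailing whitespace removed
both = padded.strip()     # whitespace removed from both ends

url = "www.cisco.com"
no_prefix = url.lstrip("w.")     # strips any leading 'w' or '.' characters
no_suffix = url.rstrip(".com")   # strips any trailing '.', 'c', 'o' or 'm'

print(repr(left), repr(right), repr(both))
print(no_prefix, no_suffix)
```

The last call removes more than the literal ".com" because every trailing character found in the argument is stripped until a character outside that set is reached.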
15.8.14. replace() method
15.8.14.1. requires two substring parameters, searches the source string for every occurrence of the first substring, replaces each one with the second substring, and returns the result as a new string
15.8.14.1.1. example:
15.8.14.2. there is a 3 parameter variant, where the 3rd parameter is an integer limiting the number of replacements
15.8.14.2.1. example:
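A brief sketch of both replace() variants. The sample text and names are illustrative assumptions.

```python
text = "This is it!"

all_replaced = text.replace("is", "are")      # every occurrence is replaced
one_replaced = text.replace("is", "are", 1)   # at most one replacement

print(all_replaced)
print(one_replaced)
```

The first "is" found is the one inside "This", which is why the limited variant changes that word and leaves the standalone "is" alone.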
15.8.15. split() method
15.8.15.1. places all substrings found in source string into elements of a list that is returned
15.8.15.1.1. example:
15.8.15.2. by default, assumes that whitespace characters are the substring delimiters
15.8.15.3. join() method performs opposite action of split(), where the source string would typically be a space or some other delimiter and the argument of join() is a list of strings
15.8.15.3.1. example:
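A small sketch showing split() and join() as opposites. The strings and names are assumptions; the explicit "," delimiter in the last line goes slightly beyond the default behaviour described above.

```python
words = "phi   chi\npsi".split()   # any run of whitespace separates items
rejoined = " ".join(words)         # rebuild a string with single spaces
fields = "a,b,,c".split(",")       # explicit delimiter keeps empty fields

print(words)
print(rejoined)
print(fields)
```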
15.9. Python string comparison operators
15.9.1. All the usual comparison operators that can be used with numbers can also be used with strings (== , != , > , >= , < , <=)
15.9.1.1. the main thing to remember when comparing strings is that the comparison is always based on the ordinal ASCII/Unicode value of each character
15.9.1.2. remember that uppercase Latin letters all occupy lower ordinal code values than lowercase ones in ASCII and Unicode
15.9.1.2.1. so... print("hello" > "Hello") returns: True
15.9.1.3. bear in mind that a longer string compared to a shorter string is always greater than the shorter one when the longer string holds identical characters at its beginning to the shorter one
15.9.1.3.1. so... print("alpha" < "alphabet") returns: True
15.9.1.4. remember that comparing numbers that are strings is done on same basis as alphabetic letter comparisons
15.9.1.4.1. so... print('10' == '010') returns: False
15.9.1.4.2. and... print('10' < '8') returns: True
15.9.1.5. comparing a string with a number is possible using == and != but it's generally a bad idea to make any comparisons between strings and numbers
15.9.1.5.1. so... print('10' == 10) returns: False
15.9.1.5.2. if you try and use one of the other comparison operators, you'll get a TypeError exception
15.10. Converting strings to numbers and vice versa
15.10.1. str() function
15.10.1.1. always safe, can convert any numeric type to a string
15.10.2. int() function
15.10.2.1. converts string representation of integer to an integer, but if string does not represent integer, it will raise a ValueError exception
15.10.2.1.1. note: int will not convert a string that represents a float to an int, but will convert an actual float to int by truncating the fractional part (i.e. rounding toward zero)
15.10.3. float() function
15.10.3.1. converts string representation of number (float or int) to a float, but if string does not represent a number, it will raise a ValueError exception
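The conversion rules above can be sketched as follows. All values and names are illustrative assumptions.

```python
as_int = int("42")             # string holding an integer
as_float = float("1.3")        # string holding a float
int_like = float("7")          # string holding an integer also works for float()
as_text = str(1.5)             # number to string
truncated = int(2.9)           # fractional part is dropped, not rounded
truncated_neg = int(-2.9)      # truncation is toward zero

try:
    int("1.3")                 # a float-like string is rejected by int()
    int_error = None
except ValueError as e:
    int_error = type(e).__name__

print(as_int, as_float, int_like, as_text, truncated, truncated_neg, int_error)
```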
16. Classes and Objects
16.1. Classes categorise by a grouping of characteristics
16.2. A class can have many sub classes and sub classes have super classes
16.2.1. Sub classes inherit from super classes
16.3. Objects are created from classes and automatically belong to a class hierarchy
16.3.1. Conceptually every object has a unique grouping of 3 attribute types: a name (think "noun"), properties (think "adjectives") and actions (think "verbs")
16.4. Python classes
16.4.1. create class using class keyword followed by class name, colon and indented class definition (similar to how function is defined)
16.4.1.1. example:
16.4.1.1.1. class TheSimplestClass: pass
16.4.2. Python objects
16.4.2.1. once a class is defined, you can create any number of objects from it by calling the class like a function and assigning the result to a variable
16.4.2.1.1. example:
16.4.2.1.2. object creation is called instantiation (i.e. it becomes an instance of the class)
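A minimal sketch of instantiation, reusing the TheSimplestClass name from the notes. The variable names are assumptions.

```python
class TheSimplestClass:
    pass

# Calling the class like a function creates a new instance each time.
first = TheSimplestClass()
second = TheSimplestClass()

same_class = type(first) is type(second)   # both are instances of one class
same_object = first is second              # but they are separate objects

print(same_class, same_object)
```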
16.5. Procedural approach suffers from issues when creating an object (e.g. a stack)
16.5.1. 1. variables referencing built-in types like lists can be accidentally altered by other code in ways not intended
16.5.1.1. solved by the class-object paradigm, which delivers encapsulation (objects cannot have their internal properties altered by external means)
16.5.2. 2. creating multiple versions of an object can often require copying code
16.5.2.1. solved by concept of instantiation (class defines all necessary properties and methods, defined just once, and having many copies at once as objects is easy)
16.5.3. 3. extending the functionality of an object can be fiddly and awkward to manage
16.5.3.1. solved by inheritance and ability to create sub classes
16.6. Python object constructor method
16.6.1. first method of a class should be: def __init__(self):
16.6.1.1. constructor methods require at least one parameter and first one must refer to the object being created - using "self" is a convention (i.e. not compulsory) but it is highly recommended to always follow that convention
16.6.2. constructor methods cannot return a value (returning anything other than None raises a TypeError exception) because their only job is to initialise the newly created object instance
16.6.3. constructor methods should not be invoked directly from the object or from inside its class (although a subclass constructor may explicitly invoke the constructor of one of its super classes)
16.7. Python object encapsulation
16.7.1. to make object properties private, you must declare them in the class constructor method with a double underscore (__) prefix
16.7.1.1. examples:
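16.7.1.1.1. public property, readable from outside the class:
    class Stack:
        def __init__(self):
            self.stackList = []

    stackObject = Stack()
    print(len(stackObject.stackList))  # prints 0
16.7.1.1.2. private property, access from outside the class fails:
    class Stack:
        def __init__(self):
            self.__stackList = []

    stackObject = Stack()
    print(len(stackObject.__stackList))  # raises AttributeError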
16.8. Python object methods
16.8.1. When defining methods in a class, you must always define them with at least one parameter and the first parameter should be "self"
16.8.1.1. example:
16.8.1.1.1. stack with push and pop methods:
    class Stack:
        def __init__(self):
            self.__stackList = []

        def push(self, val):
            self.__stackList.append(val)

        def pop(self):
            val = self.__stackList[-1]
            del self.__stackList[-1]
            return val
16.8.2. note: a method is a function defined inside a class but unlike functions, there is no such thing as a parameterless method (it is possible to invoke a method without passing an argument but you cannot define one without at least one parameter, the first of which will always be "self")
16.8.2.1. you should never attempt to explicitly pass an argument for self when invoking a method, as Python will do this automatically for you
16.8.3. self gives a method a reference to the object it was invoked on (in the constructor, the object being created) and so provides access to all the object/class variables/methods
16.8.4. object methods can be made hidden (private) just like variables by prefixing with __ and the same property name mangling occurs as for variables
16.9. Python object inheritance
16.9.1. Inheritance is achieved by defining a new class with the name of its super class in parentheses after the class name. Furthermore, the constructor method should explicitly invoke the constructor method of its super class
16.9.1.1. example:
16.9.1.1.1. subclass constructor invoking the super class constructor:
    class AddingStack(Stack):
        def __init__(self):
            Stack.__init__(self)
            self.__sum = 0
16.9.2. overriding methods via inheritance involves defining the method in the sub class as a mix of invoking the super class method and adding new functionality
16.9.2.1. example:
16.9.2.1.1. push() overridden to keep a running sum:
    class AddingStack(Stack):
        def __init__(self):
            Stack.__init__(self)
            self.__sum = 0

        def push(self, val):
            self.__sum += val
            Stack.push(self, val)
16.10. Python object return hidden property value
16.10.1. the way to return a value for a hidden object property is to define a "getter" method for it with a return statement that uses dot notation with a self reference
16.10.1.1. example:
16.10.1.1.1. getter method for the hidden __sum property:
    class AddingStack(Stack):
        def __init__(self):
            Stack.__init__(self)
            self.__sum = 0

        def getSum(self):
            return self.__sum
16.11. Python instance variables
16.11.1. the idea of instance variables is that different objects of same class can have different properties that are entirely isolated from each other, and it is even possible to extend an object with new properties, post instantiation
16.11.1.1. example:
16.11.1.1.1. three objects of the same class holding different instance variables:
    class ExampleClass:
        def __init__(self, val = 1):
            self.first = val

        def setSecond(self, val):
            self.second = val

    exampleObject1 = ExampleClass()
    exampleObject2 = ExampleClass(2)
    exampleObject2.setSecond(3)
    exampleObject3 = ExampleClass(4)
    exampleObject3.third = 5

    print(exampleObject1.__dict__)  # {'first': 1}
    print(exampleObject2.__dict__)  # {'first': 2, 'second': 3}
    print(exampleObject3.__dict__)  # {'first': 4, 'third': 5}
16.11.2. when using private variables (with the double underscore __ prefix), Python creates the instance variable names differently when those variables are created from inside class methods - it adds prefix of "_<class_name>" to the private variable name - but not when variable is created directly from outside
16.11.2.1. example:
16.11.2.1.1. the same objects using private variables:
    class ExampleClass:
        def __init__(self, val = 1):
            self.__first = val

        def setSecond(self, val = 2):
            self.__second = val

    exampleObject1 = ExampleClass()
    exampleObject2 = ExampleClass(2)
    exampleObject2.setSecond(3)
    exampleObject3 = ExampleClass(4)
    exampleObject3.__third = 5

    print(exampleObject1.__dict__)  # {'_ExampleClass__first': 1}
    print(exampleObject2.__dict__)  # {'_ExampleClass__first': 2, '_ExampleClass__second': 3}
    print(exampleObject3.__dict__)  # {'_ExampleClass__first': 4, '__third': 5}
16.11.2.2. this changing of a hidden property from __<property> to _<class>__<property> is known as property name mangling
16.12. Python object __dict__
16.12.1. Python creates a number of built in properties and methods for every new object and __dict__ is a dictionary property that holds names and values of all properties (variables) that the object is currently holding
16.13. Python class variables
16.13.1. Class variables are declared inside class definition, outside of methods, and they can be altered by methods, and key difference with instance variables is that they exist before any objects exist and keep a single value independently of all objects
16.13.1.1. example
16.13.1.1.1. class variable counting the objects created:
    class ExampleClass:
        counter = 0

        def __init__(self, val = 1):
            self.__first = val
            ExampleClass.counter += 1

    print(ExampleClass.counter)  # 0: the class variable exists before any object does

    exampleObject1 = ExampleClass()
    exampleObject2 = ExampleClass(2)
    exampleObject3 = ExampleClass(4)

    print(exampleObject1.__dict__, exampleObject1.counter)  # {'_ExampleClass__first': 1} 3
    print(exampleObject2.__dict__, exampleObject2.counter)  # {'_ExampleClass__first': 2} 3
    print(exampleObject3.__dict__, exampleObject3.counter)  # {'_ExampleClass__first': 4} 3
16.13.2. Class variables exhibit same behaviour as instance variables when defining them as "private" by using the __ prefix convention
16.13.3. Class variables are members of the class __dict__ property, which can be accessed via class_name.__dict__
16.14. Python hasattr() function
16.14.1. As Python takes a different approach to OOP from many other languages, it allows objects of the same class to have different properties; to make a safe check for the existence of a property, the hasattr() function is provided
16.14.1.1. hasattr() requires two parameters: 1. name of class or object (unquoted) 2. name of property (quoted)
16.14.1.1.1. returns True or False
16.14.1.1.2. note: hasattr() will return True when 1st arg is object name and 2nd arg is class variable, but will return False when 1st arg is class name and 2nd arg is instance variable
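The hasattr() cases in the note above can be sketched as follows. The class and attribute names are assumptions modelled on the earlier examples.

```python
class ExampleClass:
    counter = 0              # class variable

    def __init__(self):
        self.first = 1       # instance variable

obj = ExampleClass()

obj_has_first = hasattr(obj, "first")                # instance variable via object
obj_has_counter = hasattr(obj, "counter")            # class variable via object
class_has_counter = hasattr(ExampleClass, "counter") # class variable via class
class_has_first = hasattr(ExampleClass, "first")     # instance variable via class
obj_has_second = hasattr(obj, "second")              # never defined anywhere

print(obj_has_first, obj_has_counter, class_has_counter,
      class_has_first, obj_has_second)
```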
16.15. Python class __name__ property
16.15.1. __name__ is a string property tied to class only, which returns the name of the class
16.15.1.1. example:
16.15.1.1.1. class name via the class:
    class Classy:
        pass

    print(Classy.__name__)  # Classy
16.15.2. use type() function on object to return class and then return __name__ property from result
16.15.2.1. examples:
16.15.2.1.1. class of an object:
    class Classy:
        pass

    obj = Classy()
    print(type(obj))  # <class '__main__.Classy'>
16.15.2.1.2. class name of an object:
    class Classy:
        pass

    obj = Classy()
    print(type(obj).__name__)  # Classy
16.15.2.1.3. note: print(obj.__name__) will raise an AttributeError exception as __name__ does not exist in the context of an object
16.16. Python __module__ property
16.16.1. __module__ is a string property for classes and objects that returns the name of the module that defines the class
16.16.1.1. when the class definition is in the current file (as it would also be if running via interactive interpreter) the result of __module__ is always "__main__"
16.16.1.2. when you use it on an object/class after first importing the module that defines the class, then you will get the proper external module name
16.16.1.3. example:
16.16.1.3.1. class defined in the file being run:
    class Classy:
        pass

    print(Classy.__module__)  # __main__
    obj = Classy()
    print(obj.__module__)     # __main__
16.17. Python class __bases__ property
16.17.1. __bases__ is a tuple property built in for all classes, where the elements are superclasses
16.17.1.1. example:
16.17.1.1.1. listing the superclasses of Sub:
    class SuperOne:
        pass

    class SuperTwo:
        pass

    class Sub(SuperOne, SuperTwo):
        pass

    print('( ', end='')
    for x in Sub.__bases__:
        print(x.__name__, end=' ')
    print(')')  # ( SuperOne SuperTwo )
16.17.2. where a class has no superclass, it inherits from a built in Python class named object
16.17.2.1. example:
16.17.2.1.1. class with no explicit superclass:
    class SuperOne:
        pass

    print('( ', end='')
    for x in SuperOne.__bases__:
        print(x.__name__, end=' ')
    print(')')  # ( object )
16.18. Introspection
16.18.1. ability of a program to examine the type or properties of an object at runtime
16.18.1.1. Python essentially allows you to interrogate all meta data about objects and classes
16.18.2. Python issubclass() function
16.18.2.1. takes two arguments that must both reference classes (passing an object as the 1st argument raises a TypeError exception) and returns True if the 1st is a subclass of the 2nd
16.18.2.1.1. note: Python considers a class to be a subclass of itself, so that will result in True
16.18.2.1.2. example:
16.18.2.2. Python isinstance() function
16.18.2.2.1. similar to issubclass(), it takes first argument as an object reference and second as a class reference and returns True if object is an instance of the class or any of the class's subclasses
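A small sketch of issubclass() and isinstance() on a three-level hierarchy. The class names are hypothetical. issubclass(A, B) asks whether A is a subclass of B.

```python
class Vehicle:
    pass

class LandVehicle(Vehicle):
    pass

class TrackedVehicle(LandVehicle):
    pass

tank = TrackedVehicle()

sub_of_top = issubclass(TrackedVehicle, Vehicle)   # descendant vs ancestor
top_of_sub = issubclass(Vehicle, TrackedVehicle)   # ancestor vs descendant
sub_of_self = issubclass(Vehicle, Vehicle)         # a class is a subclass of itself
tank_is_vehicle = isinstance(tank, Vehicle)        # instance of a subclass counts
plain_is_tracked = isinstance(Vehicle(), TrackedVehicle)

print(sub_of_top, top_of_sub, sub_of_self, tank_is_vehicle, plain_is_tracked)
```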
16.18.3. Python is operator
16.18.3.1. use the is operator to check if one variable holds a reference to the same object as another variable, returning True or False
16.18.3.2. note: this is different to ==, which can return True when comparing two objects of the same type with identical property values, whereas for the is operator to return True, the two variables must refer to a single, common object
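The difference between is and == can be sketched with lists. The variable names are illustrative assumptions.

```python
a = [1, 2, 3]
b = a            # b refers to the same list object as a
c = [1, 2, 3]    # c is a separate list with equal contents

same_ab = a is b     # one common object
same_ac = a is c     # two distinct objects
equal_ac = a == c    # contents compare equal

b.append(4)          # changing the object through b is visible through a
print(same_ab, same_ac, equal_ac, a)
```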
16.18.4. Python super() function
16.18.4.1. The super() function can be used inside class definitions to gain access to the properties (variables and methods) of a superclass without having to explicitly name the superclass
16.18.4.1.1. example:
16.18.4.2. when using super() to access an inherited function, do not specify self as first parameter
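A minimal sketch of super() in a subclass constructor. The Super and Sub class names and the "Andy" value are assumptions for illustration.

```python
class Super:
    def __init__(self, name):
        self.name = name

    def describe(self):
        return "My name is " + self.name + "."

class Sub(Super):
    def __init__(self, name):
        # super() supplies the object automatically, so self is not passed
        super().__init__(name)

obj = Sub("Andy")
description = obj.describe()
print(description)
```

The equivalent explicit form would be Super.__init__(self, name), which does require self as the first argument.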
16.19. Reflection
16.19.1. ability of a program to manipulate the values, properties and/or functions of an object at runtime
16.19.1.1. here is a simple program that demonstrates both introspection and reflection in Python - it creates an object, adds instance variables to it (reflection), queries attributes of the object using __dict__ and getattr() (introspection) and changes state using setattr() (reflection)
16.19.1.1.1. incrementing every integer property whose name starts with 'i':
    class MyClass:
        pass

    obj = MyClass()
    obj.a = 1
    obj.b = 2
    obj.i = 3
    obj.ireal = 3.5
    obj.integer = 4
    obj.z = 5

    def incIntsI(obj):
        for name in obj.__dict__.keys():
            if name.startswith('i'):
                val = getattr(obj, name)
                if isinstance(val, int):
                    setattr(obj, name, val + 1)

    print(obj.__dict__)
    incIntsI(obj)
    print(obj.__dict__)
16.20. Python __str__() method
16.20.1. __str__() is a built-in method inherited by everything in Python and its purpose is to describe an object in a string
16.20.1.1. it can be handy to override the __str__() function for any classes you create, enabling a more user friendly description of an object
16.20.1.1.1. example:
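A hedged sketch of overriding __str__(). The Star and Plain classes are hypothetical names chosen for illustration.

```python
class Star:
    def __init__(self, name, galaxy):
        self.name = name
        self.galaxy = galaxy

    def __str__(self):
        # Used by print() and str() to describe the object
        return self.name + " in " + self.galaxy

class Plain:
    pass   # keeps the inherited default __str__()

sun = Star("Sun", "Milky Way")
described = str(sun)
default_text = str(Plain())

print(described)
print(default_text)   # default form: class name plus a memory address
```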
16.21. Python multiple inheritance
16.21.1. one type of multiple inheritance for a class (and all objects instantiated from it) is multilevel inheritance, where it inherits from every superclass in its chain of ancestors
16.21.1.1. inheritance works bottom to top, which means if any properties or methods are overridden, the lowest in the super/sub class chain is inherited
16.21.1.1.1. example:
16.21.2. another type of multiple inheritance occurs when a class inherits from two or more unrelated superclasses
16.21.2.1. inheritance works left to right, based on the order of the parameters in the subclass definition
16.21.2.1.1. example:
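Both lookup orders described above can be sketched together. All class names are hypothetical, and __mro__ (the built-in method resolution order tuple) is used here only to display the search order.

```python
# Bottom to top: the nearest definition in the chain wins.
class Level1:
    var = 100
    def fun(self):
        return 101

class Level2(Level1):
    var = 200
    def fun(self):
        return 201

class Level3(Level2):
    pass

chain_obj = Level3()
chain_var = chain_obj.var
chain_fun = chain_obj.fun()

# Left to right: the first superclass listed wins.
class Left:
    var = "L"

class Right:
    var = "R"

class Sub(Left, Right):
    pass

side_var = Sub().var
search_order = [c.__name__ for c in Sub.__mro__]

print(chain_var, chain_fun, side_var, search_order)
```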
16.21.3. take care when using the super() function with multiple inheritance: it follows the class's method resolution order (MRO) rather than naming one particular superclass, so the method that actually runs may not be the one you expect
16.21.4. although multiple inheritance is possible it should not be your first choice as it is riskier than single inheritance and can violate a principle known as the single responsibility principle
16.21.4.1. consider using composition before going for multiple inheritance
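16.21.4.2. note that Python permits multiple inheritance that forms a diamond and resolves it using the method resolution order (MRO), but it raises a TypeError exception when the superclasses are listed in an order for which no consistent MRO exists (e.g. a class listed before one of its own subclasses)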
16.21.4.2.1. example:
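As a hedged illustration of how Python treats a diamond-shaped hierarchy, the sketch below builds one that works and one ordering that Python rejects. The class names are hypothetical.

```python
class Top:
    def who(self):
        return "Top"

class MiddleLeft(Top):
    def who(self):
        return "MiddleLeft"

class MiddleRight(Top):
    def who(self):
        return "MiddleRight"

# A diamond: both superclasses share the ancestor Top.
class Bottom(MiddleLeft, MiddleRight):
    pass

winner = Bottom().who()
order = [c.__name__ for c in Bottom.__mro__]

# Listing a class before its own subclass cannot be ordered consistently.
try:
    class Broken(Top, MiddleLeft):
        pass
    mro_error = None
except TypeError as e:
    mro_error = type(e).__name__

print(winner, order, mro_error)
```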
16.22. Polymorphism
16.22.1. ability of subclasses to inherit from superclasses but change characteristics or behaviour
16.23. Composition
16.23.1. Composition is the process of composing an object using other different objects
16.23.1.1. example:
16.23.1.1.1. a Hello object composed using a World object:
    class Hello:
        firstWord = 'Hello'

        def __init__(self, nextWord):
            self.phrase = Hello.firstWord + ' ' + nextWord

    class World:
        def getWord(self):
            return 'world'

    myPhrase = Hello(World().getWord()).phrase
    print(myPhrase)  # Hello world
17. Python lambda functions
17.1. A lambda function is a function without a name (you can also call it an anonymous function)
17.2. The lambda function is a concept borrowed from mathematics, more specifically, from a part called the Lambda calculus, but these two phenomena are not the same
17.3. Mathematicians use the Lambda calculus in many formal systems connected with logic, recursion, or theorem provability. Programmers use the lambda function to simplify the code, to make it clearer and easier to understand.
17.4. declaration of the lambda function doesn't resemble a normal function declaration in any way
17.4.1. lambda parameters : expression
17.4.1.1. note that as with regular functions, parameters are optional and when there are two or more, these must be separated by commas
17.4.2. very simple example that shows you can actually name lambda functions, although this means they are no longer anonymous and in normal use you will see lambdas used anonymously
17.4.2.1. three named lambdas:
    two = lambda : 2
    sqr = lambda x : x * x
    pwr = lambda x, y : x ** y

    for a in range(-2, 3):
        print(sqr(a), end=" ")
        print(pwr(a, two()))
17.5. a lambda function can be substituted anywhere that a regular named function can
17.5.1. following two examples return same result, 1st one shows function as a named one called poly and 2nd example shows that poly can easily be replaced by an anonymous lambda function
17.5.1.1. using the named function poly:
    def printfunction(args, fun):
        for x in args:
            print('f(', x,')=', fun(x), sep='')

    def poly(x):
        return 2 * x**2 - 4 * x + 2

    printfunction([x for x in range(-2, 3)], poly)
17.5.1.2. using an anonymous lambda function in place of poly:
    def printfunction(args, fun):
        for x in args:
            print('f(', x,')=', fun(x), sep='')

    printfunction([x for x in range(-2, 3)], lambda x: 2 * x**2 - 4 * x + 2)
17.6. Python map() function
17.6.1. map() function is a common use case for lambdas (as programmers view it as more elegant coding)
17.6.2. syntax is:
17.6.2.1. map(fun, iter)
17.6.2.2. fun is a function name, which can be a named function, but can also be a lambda function
17.6.2.3. iter is an iterable such as a list, tuple or generator
17.6.2.4. there can be more than one iter argument passed
17.6.3. feeds the elements of iter (the iterable) into fun (the function) as a series of arguments and returns a map object that holds results of repeated function calls, where the map object is itself iterable
17.6.3.1. example
17.6.3.1.1. map() with one and with two iterables:
    firstNames = ["Ian","Favorita","Cristina","Busty","Amanda"]
    lastNames = ["Bradshaw","Barbarello","Barbarello-Bradshaw"]

    greetings = map(lambda f : "hello " + f, firstNames)
    for m in greetings:
        print(m)

    # with two iterables, map() stops when the shorter one (lastNames) is exhausted
    greetings = map(lambda f, l : "hello " + f + " " + l, firstNames, lastNames)
    for m in greetings:
        print(m)
17.6.4. list() function can be used to convert new map object into a list
17.6.4.1. example
17.6.4.1.1. converting the map object to a list:
    firstNames = ["Ian","Favorita","Cristina"]
    print(list(map(lambda f : "hello " + f, firstNames)))
17.7. Python filter() function
17.7.1. filter() function is another one, similar to map(), that is often combined with lambda functions for more elegant syntax
17.7.2. syntax is same as map() except that it accepts only 2 arguments (i.e. you cannot specify multiple iterator arguments, like you can with map() )
17.7.3. the function (1st argument) is called with each element of the 2nd argument in turn and must return a True/False result; filter() keeps only the elements for which the function yields True
17.7.3.1. example
17.7.3.1.1. keeping only the positive even numbers:
    data = [0, 1, 2, 3, 4]
    filtered = list(filter(lambda x: x > 0 and x % 2 == 0, data))
    print(data)      # [0, 1, 2, 3, 4]
    print(filtered)  # [2, 4]
18. Python file processing
18.1. One of the most common issues in the developer's job is to process data stored in files
18.2. Different operating systems can treat the files in different ways. For example, Windows uses a different naming convention than the one adopted in Unix/Linux systems.
18.3. canonical file names
18.3.1. name which uniquely defines the location of the file regardless of its level in the directory tree
18.3.2. different between Windows and Linux
18.3.2.1. example
18.3.2.1.1. Windows
18.3.2.1.2. Linux/Unix
18.3.2.2. Windows uses drive letters, Linux does not
18.3.2.3. Root directory is \ in Windows but / in Linux, and sub-directories are denoted same way
18.3.2.4. Linux canonical file names are case sensitive, but not so for Windows
18.3.3. care is needed when specifying canonical file names on a Windows platform because the backslash \ acts as an escape character in Python string expressions
18.3.3.1. one option is to escape the backslash with \\
18.3.3.1.1. name = "C:\\dir\\file"
18.3.3.2. another option is to take advantage of an automated conversion provided by Python, which allows you to express Windows canonical file names using the forward slash
18.3.3.2.1. name = "C:/dir/file"
18.4. Python (like almost every other programming language) does not interact directly with files but does so via abstractions that are commonly referred to as handles or streams
18.4.1. Python provides a rich set of functions and methods that perform operations on streams, which affect the real files using mechanisms contained in the operating system kernel
18.4.2. A stream must be connected to a physical file in a process known as binding
18.4.2.1. when a stream is connected, this is called opening the file
18.4.2.2. when a stream is disconnected, this is called closing the file
18.4.2.3. in between opening and closing a file, the program is free to invoke functions/methods on the stream that will manipulate the file in some way
18.4.2.4. opening a file can fail for multiple reasons and it is important that your program is designed to handle such failures
18.5. File stream opening and closing
18.5.1. must declare open mode
18.5.1.1. read mode
18.5.1.1.1. a stream opened in this mode allows read operations only; trying to write to the stream will cause an exception
18.5.1.2. write mode
18.5.1.2.1. a stream opened in this mode allows write operations only; attempting to read the stream will cause an exception
18.5.1.3. update mode
18.5.1.3.1. a stream opened in this mode allows both writes and reads
18.5.2. attempting an operation not permitted for open mode will cause UnsupportedOperation exception, which inherits OSError and ValueError, and comes from the io module
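A hedged sketch of the UnsupportedOperation case described above. It writes to a temporary file so that it is self-contained; the file name and variable names are assumptions.

```python
import io
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.txt")

stream = open(path, "wt")        # write mode: reads are not permitted
stream.write("one line\n")
try:
    stream.read()
    outcome = "read succeeded"
except io.UnsupportedOperation:
    outcome = "UnsupportedOperation"
stream.close()

# The exception class inherits from both OSError and ValueError.
is_os_error = issubclass(io.UnsupportedOperation, OSError)
is_value_error = issubclass(io.UnsupportedOperation, ValueError)

print(outcome, is_os_error, is_value_error)
```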
18.5.3. think of a stream as behaving rather like a tape recorder
18.5.3.1. When you read something from a stream, a virtual head moves over the stream, reading data into memory
18.5.3.2. When you write something to the stream, the same head moves along the stream recording the data from the memory
18.5.3.3. current file position
18.5.3.3.1. a commonly used term, picture this as the current position of the tape recorder read/write head, except it is referring of course to the file stream
18.5.4. file streams are provided via the io module
18.5.4.1. most file streams will inherit from IOBase, which is superclass for the following 3 subclasses
18.5.4.1.1. TextIOBase
18.5.4.1.2. BufferedIOBase
18.5.4.1.3. RawIOBase
18.5.4.2. Python open() function
18.5.4.2.1. built in function that creates file stream object and attempts to connect stream to file (i.e. opening the file)
18.5.4.2.2. note: you do not normally invoke the constructors of IOBase or any of its subclasses to create a file stream; use the built in open() function instead
18.5.4.2.3. has one mandatory parameter for the file name, and 7 other parameters that are all optional as they have default values
18.5.4.2.4. open mode is 2nd parameter
18.5.4.3. Python close() method
18.5.4.3.1. invoke this method on an object created via the open() function to close the file stream, removing its connection to the file (the stream object can no longer be used for reading or writing)
18.5.4.3.2. when close() method invoked on stream object, the buffering (a.k.a. caching) mechanism that handles transfer of data from memory to physical device forces a flush of buffers
18.6. File stream types
18.6.1. text stream
18.6.1.1. text streams are structured in lines; that is, they contain typographical characters (letters, digits, punctuation, etc.) arranged in rows (lines), as seen with the naked eye when you look at the contents of the file in the editor
18.6.1.2. This file is written (or read) mostly character by character, or line by line
18.6.1.2.1. portability consideration
18.6.2. binary stream
18.6.2.1. binary streams don't contain text but a sequence of bytes of any value. This sequence can be, for example, an executable program, an image, an audio or a video clip, a database file, etc
18.6.2.2. Because these files don't contain lines, the reads and writes relate to portions of data of any size. Hence the data is read/written byte by byte, or block by block, where the size of the block usually ranges from one byte to an arbitrarily chosen value.
18.7. Python pre-opened streams
18.7.1. the general rule for streams is that they must be explicitly opened before they can be used, but there are 3 exceptions to this rule
18.7.1.1. the following 3 streams are pre-opened when every Python program starts, and they are defined within the sys module
18.7.1.1.1. sys.stdin
18.7.1.1.2. sys.stdout
18.7.1.1.3. sys.stderr
18.8. Python stream read() method
18.8.1. syntax is read(size)
18.8.1.1. size is an integer giving the maximum number of characters to read from a text stream (or bytes from a binary stream)
18.8.1.2. size is optional, if omitted it defaults to -1 which tells Python to read in the entire file
18.8.1.3. remember that one character is only assured of occupying exactly 1 byte in a text file whose characters all conform to the ASCII encoding set; in encodings such as UTF-8 other characters occupy several bytes
18.8.1.4. if you specify size as a number that exceeds the total number of bytes in the file, it just behaves as if you specified -1 (i.e. reads in the whole file)
18.8.2. returns a string representing either the whole file or first x characters from file
18.8.3. example:
18.8.3.1. reading a whole text file:
    stream = open("./Test-Files/tzop.txt", "rt", encoding = "utf-8")
    print(stream.read())
    stream.close()
18.8.3.1.1. opens file (tzop.txt), which creates a text stream in read mode for file with utf-8 encoding
18.8.3.1.2. reads full content of file and prints to the screen
18.8.4. warning, using read() method without any parameters can be dangerous - the whole file is loaded into memory, so reading a terabyte-long file this way may exhaust memory and destabilise your system
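A hedged sketch of read(size) on a text stream. It first writes a small UTF-8 file to a temporary location so that it is self-contained; the file name, contents and variable names are assumptions.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "sample.txt")

ws = open(path, "wt", encoding="utf-8")
ws.write("héllo wörld\nsecond line\n")
ws.close()

rs = open(path, "rt", encoding="utf-8")
head = rs.read(5)     # first 5 characters, although they occupy 6 bytes on disk
rest = rs.read()      # no size: everything from the current position to EOF
at_end = rs.read()    # already at EOF: empty string
rs.close()

chars_read = len(head + rest)
bytes_on_disk = os.path.getsize(path)
print(repr(head), repr(rest), repr(at_end), chars_read, bytes_on_disk)
```

The byte count is larger than the character count because the accented letters need more than one byte each in UTF-8.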
18.9. Python stream readline() method
18.9.1. syntax is readline(size)
18.9.1.1. as with read() method, size limits how much is read (characters for a text stream), and if the line or the file ends before size is reached the method simply returns everything up to the end of the line or the EOF (end of file) marker
18.9.1.2. new line character is also returned
18.9.2. returns a string representing a single line from file
18.9.3. each invocation of readline() returns another line until EOF marker hit, after which it will return an empty string
18.9.4. example:
18.9.4.1. counting lines and characters with readline():
    from os import strerror

    try:
        ccnt = lcnt = 0
        s = open('./Test-Files/tzop.txt', 'rt')
        line = s.readline()
        while line != '':
            lcnt += 1
            for ch in line:
                print(ch, end='')
                ccnt += 1
            line = s.readline()
        s.close()
        print("\n\nCharacters in file:", ccnt)
        print("Lines in file: ", lcnt)
    except IOError as e:
        print("I/O error occurred:", strerror(e.errno))
18.9.4.1.1. open file "tzop.txt" in read mode using a text stream
18.9.4.1.2. read 1st line in, then enter while loop that repeatedly reads successive lines in until EOF (readline() returns ''), count number of lines, and for each line loop through every character, printing to screen and counting total number of characters
18.10. Python stream readlines() method
18.10.1. syntax is readlines(hintsize)
18.10.1.1. hintsize is optional, if omitted will return every line in file as element in list
18.10.1.2. if hintsize is used, it represents bytes and will be used as a guide as to when to stop reading lines - this will be a rounded up number of bytes and may represent an internal buffer size
18.10.2. returns a list where every element is a string representing a line from the file (including the \n end of line character)
18.10.3. example:
18.10.3.1. reading every line into a list:
    from os import strerror

    try:
        s = open("./Test-Files/test.txt","rt")
        lines = s.readlines()
        s.close()
        print(lines)
    except IOError as e:
        print("I/O error occurred:",strerror(e.errno))
18.10.3.1.1. note: test.txt is a 3 line text file and print(lines) returns a list that include 3 string elements, the first two of which conclude with the \n (new line) character
18.11. Python stream objects are iterators
18.11.1. this provides an alternative means to process a file, line by line
18.11.2. the __next__() method of stream object returns the next line from file
18.11.3. iterating a stream object does not invoke close() for you; in the example below the file is closed only when the unreferenced stream object is reclaimed by the interpreter, so keep a reference and invoke close() explicitly if the timing matters
18.11.4. example:
18.11.4.1. iterating directly over the stream:
    from os import strerror

    try:
        for line in open("./Test-Files/test.txt","rt"):
            print(line,end="")
    except IOError as e:
        print("I/O error occurred:", strerror(e.errno))
18.11.4.1.1. note how compact and elegant this code is when you need to process a text file line by line
18.11.4.1.2. we can take advantage of the stream object being an iterator, which avoids the need to invoke the readlines() method
18.12. Python stream write() method
18.12.1. takes a single string argument that will be written to the underlying file via the stream
18.12.2. will raise exception if file stream has been opened in a mode not compatible with write operations
18.12.3. does not automatically add new line characters, so you must add \n at the end of lines if required
18.12.3.1. note that you use \n even in Python programs running on Windows hosts and these will be automatically converted to \r\n in the final file
18.12.3.1.1. do not specify \r\n in your Python code for line endings as this will turn into a combination of CR+CR+LF
18.12.4. example
18.12.4.1. writing three lines to a text file:
    from os import strerror

    try:
        ws = open('./Test-Files/test.txt','wt')
        ws.write("This is my first file that I created in Python\n")
        ws.write("What do you think?\n")
        ws.write("Pretty cool, huh?")
        ws.close()
    except IOError as e:
        print("I/O error occurred:",strerror(e.errno))
18.12.5. you can also invoke write method on stderr if you want to write directly to that - just remember that you do not have to explicitly open/close that stream
18.12.5.1. example
18.12.5.1.1. writing to the pre-opened stderr stream:
    import sys
    sys.stderr.write("Error message")
18.13. Processing amorphous data
18.13.1. Amorphous data is data which has no specific shape or form - it is just a series of bytes
18.13.1.1. This doesn't mean that these bytes cannot have their own meaning, or cannot represent any useful object, e.g., bitmap graphics
18.13.2. Python requires specialized classes to handle amorphous data
18.13.2.1. bytearray is one such class for handling amorphous data
18.13.2.1.1. bytearray is built in, so its constructor can be invoked without importing any modules
18.13.2.1.2. bytearray() constructor takes 3 optional arguments: source, encoding and errors
18.13.2.1.3. bytearray is similar to a list in that it is mutable, can be iterated and can have its elements (integers from 0 to 255) read and/or updated
18.13.2.1.4. writing bytearray objects to file streams is done in normal way with stream .write method - you must open the stream in mode compatible for writing to a binary file
18.13.2.1.5. Python stream .readinto() method
18.13.2.1.6. you can also read an entire binary file into memory using the stream .read() method
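A hedged sketch of the bytearray operations listed above: writing to a binary stream, refilling an existing bytearray with readinto(), and reading a whole binary file with read(). The temporary file name and variable names are assumptions.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "file.bin")

# Build a 10-byte bytearray holding the values 10..19.
data = bytearray(10)
for i in range(len(data)):
    data[i] = 10 + i

bf = open(path, "wb")            # binary write mode
bf.write(data)
bf.close()

# readinto() fills an existing bytearray and returns the number of bytes read.
target = bytearray(10)
bf = open(path, "rb")            # binary read mode
count = bf.readinto(target)
bf.close()

# read() returns an immutable bytes object; wrap it to get a bytearray.
bf = open(path, "rb")
whole = bytearray(bf.read())
bf.close()

print(count, list(target), list(whole))
```

readinto() avoids creating a new object for each read, whereas read() allocates a fresh bytes object holding the file contents.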