Sunday, March 18, 2018

Python: getting started


Intro

Lately, Python has been getting more and more popular in the Big Data world.
From Wikipedia:
Python is an interpreted high-level programming language for general-purpose programming. Created by Guido van Rossum and first released in 1991, Python has a design philosophy that emphasizes code readability, and a syntax that allows programmers to express concepts in fewer lines of code,[26][27] notably using significant whitespace. It provides constructs that enable clear programming on both small and large scales.[28]
Python features a dynamic type system and automatic memory management. It supports multiple programming paradigms, including object-oriented, imperative, functional and procedural, and has a large and comprehensive standard library.[29]
Python interpreters are available for many operating systems. CPython, the reference implementation of Python, is open source software[30] and has a community-based development model, as do nearly all of its variant implementations. CPython is managed by the non-profit Python Software Foundation.

From official documentation:
Python is powerful... and fast; 
plays well with others; 
runs everywhere; 
is friendly & easy to learn; 
is Open.


Installation

The installation process varies depending on the operating system: it can be an installer on Windows or just running "apt-get" on Linux. Either way, the official site has all the information you need.

Python shell

The easiest way to play with Python is the command shell, which can be opened by running the "python" command:

demien$ python
Python 2.7.13 (default, Nov 24 2017, 17:33:09)
[GCC 6.3.0 20170516] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>

Now we can output the standard "hello world" greeting message:
>>> print("hello world")
hello world


Variables are created with a simple "=" operator:
>>> name = "Huan Sebastyan"
>>> print("hello, {0} !!!".format(name))
hello, Huan Sebastyan !!!


PY Files

For long programs, the Python shell is not an option: it's better to put the program into a file (or files) with the extension ".py" and run it by executing: python myprogram.py. In the code examples below I'm writing the code into the file test.py and executing it by running: python test.py

The main thing to know about Python files is how "blocks" of code work: Python doesn't have {} or begin/end keywords to define the scope (beginning and end) of a block, function or class. For this purpose Python uses indentation:

statement1
statement2
def myFunction():
    function statement1
    function statement2
statement3
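
As a concrete sketch of the layout above (the function name describe and its messages are made up for illustration), the indented lines form the function body, and the first line back at the left margin ends it:

```python
def describe(number):
    # Everything indented under "def" belongs to the function body.
    if number % 2 == 0:
        # A deeper indentation level opens a nested block.
        kind = "even"
    else:
        kind = "odd"
    return "{0} is {1}".format(number, kind)

# Back at the left margin: this line is outside the function.
message = describe(4)
print(message)
```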


Conditions in Python

Conditions are written without any parentheses, and of course the block inside a condition is indented to mark where it begins and ends:

name = raw_input("What is your name? ")
greeting = ""
if name == "Huan Sebastyan":
    greeting = "Buenos dias"
else:
    greeting = "Hello"
print("{0}, {1} !!!".format(greeting, name))

execution:

demien$ python test.py
What is your name? Joe
Hello, Joe !!!

demien$ python test.py
What is your name? Huan Sebastyan
Buenos dias, Huan Sebastyan !!!
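
A condition can also have more than two branches. Here is a minimal sketch using "elif"; the helper pick_greeting and the empty-name branch are my own additions for illustration, not part of the program above:

```python
def pick_greeting(name):
    # Branches are checked top to bottom; the first match wins.
    if name == "Huan Sebastyan":
        return "Buenos dias"
    elif name == "":
        return "Who is there"
    else:
        return "Hello"

print("{0}, {1} !!!".format(pick_greeting("Joe"), "Joe"))
```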

Loops

There are two types of loops in Python: while and for

While loop:


value = ""
while value != "end":
    value = raw_input("Enter 'end' to quit or anything else to continue: ")
    print("you entered:{0}".format(value))
print("you did it!")


execution: 
demien$ python test.py
Enter 'end' to quit or anything else to continue: hello
you entered:hello
Enter 'end' to quit or anything else to continue: world
you entered:world
Enter 'end' to quit or anything else to continue: end
you entered:end
you did it!

For loop:

count = int(raw_input("enter iteration count: "))
for i in range(1, count+1):
    s = ""
    for j in range(1, i+1):
        s += "*"
    print(s)

execution:
demien$ python test.py
enter iteration count: 5
*
**
***
****
*****
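
The inner loop is not strictly needed: a string can be repeated with the "*" operator. A small sketch of the same triangle, wrapped in a hypothetical function triangle so that it needs no keyboard input:

```python
def triangle(count):
    rows = []
    # range(1, count + 1) yields 1, 2, ..., count (the end value is excluded).
    for i in range(1, count + 1):
        rows.append("*" * i)  # "*" * i repeats the string i times
    return rows

for row in triangle(3):
    print(row)
```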


Functions

A function definition begins with the keyword "def"; parentheses are used for parameters:


x = 10
y = 20

def test(param):
    global x
    y = 20
    x = x + 1
    y = y + 1
    print("from function context: param={0}, x={1}, y={2}".format(param, x, y))
    return "BYE!"

result = test("HELLO!")
print("from global context: result={0}, x={1}, y={2}".format(result, x, y))

Execution:
demien$ python test.py
from function context: param=HELLO!, x=11, y=21
from global context: result=BYE!, x=11, y=20
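
The same local-scope rule applies to parameters. In this small sketch (the function add_one is made up for illustration), reassigning the parameter inside the function leaves the caller's variable untouched:

```python
def add_one(value):
    # "value" is a local name; rebinding it does not touch the caller's variable.
    value = value + 1
    return value

x = 10
result = add_one(x)
print("x={0}, result={1}".format(x, result))
```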


Modules

In big programs it's impractical to keep all the code in one file. Python provides the concept of a MODULE for this.
Let's create a simple module in the subfolder "tools":

tools/simple.py: 

def sayHi(name):
    print("Hi, {0}".format(name))
__version__ = "0.0.1"

To use it, besides the module itself we also need an empty file __init__.py in the same folder (the subfolder "tools"):
demien$ ls -la tools/*.py
-rw-r--r-- 1 demien demien  0 mar 11 14:49 tools/__init__.py
-rw-r--r-- 1 demien demien 78 mar 11 14:46 tools/simple.py

Now we can use this module from our main program: 

import tools.simple as simple

simple.sayHi("Joe")


Execution: 
demien$ python test.py
Hi, Joe


Sometimes we need to know whether the code is being run by "direct" execution or imported as a module. For this we can check the condition: if __name__ == "__main__":
- if it is true, the code is being executed in "direct" mode. Our current module produces no output when executed directly. Let's update it to produce some output, but only in case of "direct" execution:

def sayHi(name):
    print("Hi, {0}".format(name))
__version__ = "0.0.1"
if __name__ == "__main__":
    sayHi("Huan Sebastyan")


So, now we can run our module in a direct way: 

demien$ python tools/simple.py
Hi, Huan Sebastyan

And now let's use it as module: 

demien$ python test.py
Hi, Joe

- output remained the same. 
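
A common way to organize this check is to put the "direct" code into a function, by convention often called main. This is just a sketch of that layout; say_hi and main are names I picked for the example:

```python
def say_hi(name):
    # Reusable part: importing the module only defines this function.
    return "Hi, {0}".format(name)

def main():
    # Runs only when the file is executed directly.
    print(say_hi("Huan Sebastyan"))

if __name__ == "__main__":
    main()
```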


Dir function

The dir function lists the names (variables, functions and so on) defined in a module:

>>> import tools.simple
>>> dir(tools.simple)
['__builtins__', '__doc__', '__file__', '__name__', '__package__', '__version__', 'sayHi']
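
Since dir returns an ordinary list of strings, it can be filtered like any other list. A small sketch using the standard math module instead of our own module, so it runs anywhere:

```python
import math

# dir() returns a plain list of attribute names (strings).
names = dir(math)

# Filter out the "__special__" names to keep only the public ones.
public_names = [n for n in names if not n.startswith("__")]

print("sqrt" in public_names)
```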


Collection classes

There are several collection classes in Python: lists, dictionaries, sets and tuples.

Lists

A list is a mutable structure for storing an ordered sequence of items, like an array.

colors = ["red", "black", "white"]
print "there are ",len(colors)," colors in my list:"
for color in colors:
    print(color)

colors.append("blue")
colors.append("green")
print "few more there added, now it's ",len(colors)," of them"

colors.sort()
print("sorted:", colors)

del colors[0]
del colors[0]
print("first 2 were deleted:", colors)

Execution: 
demien$ python test.py
there are  3  colors in my list:
red
black
white
few more there added, now it's  5  of them
('sorted:', ['black', 'blue', 'green', 'red', 'white'])
('first 2 were deleted:', ['green', 'red', 'white'])

Tuples

Tuples are similar to lists, but they are immutable:

answer = ("yes", "no")
print answer[0]
print answer[1]
execution:
demien$ python test.py
yes
no
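
To see what "immutable" means in practice, here is a short sketch: trying to assign to a tuple element raises a TypeError. It also shows unpacking a tuple into separate variables (the variable names are just for illustration):

```python
answer = ("yes", "no")

try:
    answer[0] = "maybe"  # tuples do not support item assignment
    changed = True
except TypeError:
    changed = False

# A tuple can be unpacked into separate variables in one step.
first, second = answer

print("changed={0}, first={1}, second={2}".format(changed, first, second))
```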


Dictionary

A dictionary is a structure like a map or associative array.

user = {
    "name" : "Joe",
    "surname" : "Black",
    "address" : {
        "country" : "USA",
        "city" : "Houston"
    }
}

print user["name"]
print user["address"]
print user["address"]["city"]

print "full list of pairs[key,value] in user dictionary:"
for key, value in user.items():
    print "key=", key, " value=", value


Execution:
demien$ python test.py
Joe
{'country': 'USA', 'city': 'Houston'}
Houston
full list of pairs[key,value] in user dictionary:
key= surname  value= Black
key= name  value= Joe
key= address  value= {'country': 'USA', 'city': 'Houston'}
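
Note that the pairs above came out in a different order than they were written: in Python 2.7 a dictionary does not keep insertion order. A small sketch of a few more everyday dictionary operations, including iterating over sorted keys to get a stable order (the "city" and "phone" keys are made up for the example):

```python
user = {"name": "Joe", "surname": "Black"}

user["city"] = "Houston"          # add a new key
has_phone = "phone" in user       # membership test checks the keys
phone = user.get("phone", "n/a")  # default value instead of a KeyError

# Sorting the keys gives a predictable iteration order.
for key in sorted(user):
    print("{0}={1}".format(key, user[key]))
```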


Sequence operations. 

Structures like lists, tuples and strings share a set of common "sequence" operations, such as indexing and slicing.

colors = ["red", "black", "white", "blue", "gray", "green", "orange"]
colors.sort()
print colors

print('color 2 is', colors[2])
print('color -2 is', colors[-2])

print('colors 1 to 3 is', colors[1:3])
print('colors 2 to end is', colors[2:])
print('colors 1 to -1 is', colors[1:-1])
print('colors start to end is', colors[:])

Execution:
demien$ python test.py
['black', 'blue', 'gray', 'green', 'orange', 'red', 'white']
('color 2 is', 'gray')
('color -2 is', 'red')
('colors 1 to 3 is', ['blue', 'gray'])
('colors 2 to end is', ['gray', 'green', 'orange', 'red', 'white'])
('colors 1 to -1 is', ['blue', 'gray', 'green', 'orange', 'red'])
('colors start to end is', ['black', 'blue', 'gray', 'green', 'orange', 'red', 'white'])
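
The example above uses a list, but the same operations work on strings and tuples too. A short sketch with made-up values:

```python
word = "python"
point = (10, 20, 30, 40)

first_letter = word[0]   # "p"
last_letter = word[-1]   # "n"
middle = word[1:-1]      # "ytho"
tail = point[2:]         # (30, 40)
has_y = "y" in word      # True
length = len(point)      # 4

print(middle)
```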


Set

Another collection structure is the set. On sets we can apply set algebra: intersection (&), union (|) and symmetric difference (^), which correspond to logical AND, OR and XOR:

colors1 = set(["red", "black", "white"])
colors2 = set(["white", "blue", "gray"])
print(colors1)
print(colors2)

print("& : ", colors1 & colors2)
print("| : ", colors1 | colors2)
print("^ : ", colors1 ^ colors2)
Execution: 
demien$ python test.py
set(['white', 'black', 'red'])
set(['blue', 'gray', 'white'])
('& : ', set(['white']))
('| : ', set(['blue', 'gray', 'black', 'white', 'red']))
('^ : ', set(['blue', 'gray', 'black', 'red']))
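
Each of these operators also has a named method, and there is a "-" operator for set difference. A small sketch with the same two sets:

```python
colors1 = set(["red", "black", "white"])
colors2 = set(["white", "blue", "gray"])

common = colors1.intersection(colors2)            # same as colors1 & colors2
either = colors1.union(colors2)                   # same as colors1 | colors2
only_one = colors1.symmetric_difference(colors2)  # same as colors1 ^ colors2
only_first = colors1 - colors2                    # items of colors1 not in colors2

print(sorted(only_first))
```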



Classes

The main points about OOP in Python are:
- objects are created without a "new" keyword: userJoe = User("Joe", "Black")
- the constructor is named __init__
- variables defined in the class body are class (not object) variables and should be accessed as className.variableName
- "self" plays the role of "this" and is used to define object variables: self.name = name
- methods that use object variables have to explicitly declare self as the first parameter: def sayHi(self)
- to define a subclass, the superclass name is passed in parentheses after the subclass name: class AdminUser(User)
- to call a superclass method (even the constructor), the format is SuperClassName.methodName with self passed explicitly: User.__init__(self, name, surname)


Example: 

class User:
    userCount = 0

    def __init__(self, name, surname):
        self.name = name
        self.surname = surname
        User.userCount+=1
        print("User #{0} was created!".format(User.userCount))

    def sayHi(self):
        print("Hi, I'm {0} {1}".format(self.name, self.surname) )

class AdminUser(User):
    def __init__(self, name, surname, role):
        User.__init__(self, name, surname)
        self.role = role

    def sayHi(self):
        User.sayHi(self)
        print(" and I'm the {0} !!!".format(self.role))

userJoe = User("Joe", "Black")
userJoe.sayHi()

userHuan = User("Huan", "Seastyan")
userHuan.sayHi()

userAdmin = AdminUser("Super", "Admin", "boss")
userAdmin.sayHi()
Execution: 
demien$ python test.py
User #1 was created!
Hi, I'm Joe Black
User #2 was created!
Hi, I'm Huan Seastyan
User #3 was created!
Hi, I'm Super Admin
  and I'm the boss !!!
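
To make the difference between class variables and object variables a bit more visible, here is a minimal sketch with a made-up class Counter: the class variable is shared by all objects, while each object keeps its own label.

```python
class Counter(object):
    created = 0  # class variable, shared by all objects

    def __init__(self, label):
        self.label = label    # object variable, separate for each object
        Counter.created += 1  # updated through the class name

a = Counter("first")
b = Counter("second")

print("{0} objects: {1}, {2}".format(Counter.created, a.label, b.label))
```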

Files

To work with files all we need is the "file()" function (available in Python 2; the built-in "open()" accepts the same arguments), which takes two parameters:
1. file name
2. mode: read or write

Let's create a simple file in.txt with 2 lines of text:
it's a test file
just as example

Now let's create a simple program which will convert this file to uppercase and write it into "out.txt" file: 

inFile = file("in.txt", "r")
outFile = file("out.txt", "w")

eof = False
while not eof:
    line = inFile.readline()
    if len(line) == 0:
        eof = True
    else:
        outFile.write(line.upper())

inFile.close()
outFile.close()


Execution result
out.txt:
IT'S A TEST FILE
JUST AS EXAMPLE
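
The same conversion can be written a bit shorter, because a file object can be iterated line by line in a for loop. This sketch uses open() and creates its own input file in a temporary folder so that it is self-contained; the paths are assumptions for the example only:

```python
import os
import tempfile

# Work in a throwaway folder so the sketch does not depend on existing files.
folder = tempfile.mkdtemp()
in_path = os.path.join(folder, "in.txt")
out_path = os.path.join(folder, "out.txt")

in_file = open(in_path, "w")
in_file.write("it's a test file\njust as example\n")
in_file.close()

in_file = open(in_path, "r")
out_file = open(out_path, "w")
for line in in_file:  # a file object yields its lines one by one
    out_file.write(line.upper())
in_file.close()
out_file.close()

check = open(out_path, "r")
result = check.read()
check.close()
print(result)
```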

Pickle

The pickle module (which has to be imported) can save an object to a file in "serialized" format. Later we can read the file and deserialize it back into the original object.

import pickle

class User:
    userCount = 0

    def __init__(self, name, surname):
        self.name = name
        self.surname = surname
        User.userCount+=1
        print("User #{0} was created!".format(User.userCount))

    def sayHi(self):
        print("Hi, I'm {0} {1}".format(self.name, self.surname) )

userJoe = User("Joe", "Black")

fout = open("joe.bak", "wb")
pickle.dump(userJoe, fout)
fout.close()
del(userJoe)

fin = open("joe.bak", "rb")
restored = pickle.load(fin)
print(restored)
restored.sayHi()
fin.close()

Execution: 
demien$ python test.py
User #1 was created!
<__main__.User instance at 0x7f05c0296ef0>
Hi, I'm Joe Black
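
Pickle can also work without a file: pickle.dumps returns the serialized bytes and pickle.loads restores the object from them. A small sketch with a made-up dictionary:

```python
import pickle

settings = {"name": "Joe", "colors": ["red", "black"], "age": 30}

data = pickle.dumps(settings)  # serialize into a byte string
restored = pickle.loads(data)  # build a new, equal object from the bytes

print(restored == settings)
```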

Exceptions

Python has an error handling system similar to other languages, with TRY, EXCEPT (which stands for CATCH) and FINALLY. In the next example we handle keyboard input exceptions, such as pressing Ctrl+C during input, and raise our own exception if the input is shorter than expected:

class ShortInputException(Exception):
    """A user-defined exception class."""

    def __init__(self, length, atleast):
        Exception.__init__(self)
        self.length = length
        self.atleast = atleast


try:
    text = raw_input("Enter something ....")
    if len(text) < 3 :
        raise ShortInputException(len(text), 3)

except EOFError:
    print("Why did you do an EOF on me?")

except KeyboardInterrupt:
    print("You cancelled the operation.")

except ShortInputException as ex:
    print(("ShortInputException: The input was " +
           "{0} long, expected at least {1}")
          .format(ex.length, ex.atleast))

else:
    print("You entered {}".format(text))

finally:
    print("done")


Execution
demien$ python test.py
Enter something ....12345
You entered 12345
done

demien$ python test.py
Enter something ....12
ShortInputException: The input was 2 long, expected at least 3
done

demien$ python test.py
Enter something ....^CYou cancelled the operation.
done
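
To see the order in which the EXCEPT, ELSE and FINALLY blocks run without typing anything, here is a sketch that wraps the same logic in a hypothetical function check and records which blocks were visited:

```python
class ShortInputException(Exception):
    def __init__(self, length, atleast):
        Exception.__init__(self)
        self.length = length
        self.atleast = atleast

def check(text, atleast=3):
    events = []
    try:
        if len(text) < atleast:
            raise ShortInputException(len(text), atleast)
    except ShortInputException as ex:
        events.append("too short: {0} < {1}".format(ex.length, ex.atleast))
    else:
        events.append("ok")    # runs only when no exception was raised
    finally:
        events.append("done")  # runs in both cases
    return events

print(check("12345"))
print(check("12"))
```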


Try with resource: WITH

If we open something in a TRY block, very often we have to close it in a FINALLY block. To do this "automatically" we can use the WITH construction (similar to try-with-resources in other languages): the opened resource will be closed automatically when the block ends:

with open("in.txt") as fin:
    for line in fin:
        print(line)
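
A small sketch to check that the file really is closed after the block. It writes its own file into a temporary folder (the path is an assumption for the example) and looks at the "closed" attribute of the file object afterwards:

```python
import os
import tempfile

folder = tempfile.mkdtemp()
path = os.path.join(folder, "in.txt")

with open(path, "w") as fout:
    fout.write("it's a test file\n")

# The variable still exists after the block, but the file is already closed.
closed_after_write = fout.closed

with open(path) as fin:
    # Each line keeps its trailing newline, so strip it before storing.
    lines = [line.rstrip("\n") for line in fin]

print(lines)
```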



The end

As for me, Python has a lot in common with JavaScript (dynamic typing, inheritance), but it is more focused on "back-end" development and scripting. It's very simple but powerful. A lot of big data frameworks now provide a Python API, so it's worth being familiar with this language.