Intro
Lately, Python has been getting more and more popular in the Big Data world. From Wikipedia:
Python is an interpreted high-level programming language for general-purpose programming. Created by Guido van Rossum and first released in 1991, Python has a design philosophy that emphasizes code readability, and a syntax that allows programmers to express concepts in fewer lines of code,[26][27] notably using significant whitespace. It provides constructs that enable clear programming on both small and large scales.[28]
Python features a dynamic type system and automatic memory management. It supports multiple programming paradigms, including object-oriented, imperative, functional and procedural, and has a large and comprehensive standard library.[29]
Python interpreters are available for many operating systems. CPython, the reference implementation of Python, is open source software[30] and has a community-based development model, as do nearly all of its variant implementations. CPython is managed by the non-profit Python Software Foundation.
From official documentation:
Python is powerful... and fast;
plays well with others;
runs everywhere;
is friendly & easy to learn;
is Open.
Installation
The installation process varies by operating system: it can be an installer on Windows or just running "apt-get" on Linux. Either way, the official site has all the information you need.
Python shell
The easiest way to play with Python is the interactive shell, which can be opened by running the "python" command:
demien$ python
Python 2.7.13 (default, Nov 24 2017, 17:33:09)
[GCC 6.3.0 20170516] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>
Now we can output standard "hello world" greeting message:
>>> print("hello world")
hello world
Variables are created with a simple "=" assignment:
>>> name = "Huan Sebastyan"
>>> print("hello, {0} !!!".format(name))
hello, Huan Sebastyan !!!
PY Files
For long programs the Python shell is not an option: it's better to put the program into a file (or files) with the extension ".py" and run it with: python myprogram.py. In the code examples below I write the code into the file test.py and execute it by running: python test.py.
The main thing to know about Python files is how "blocks" of code work: Python doesn't have {} or begin/end keywords to mark the beginning and end of a block, function or class. Python uses indentation (leading spaces) for this purpose:
operator1
operator2
def myFunction():
    function operator1
    function operator2
operator3
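As a runnable counterpart to the outline above, here is a minimal sketch. The function name greet is made up for illustration; the only point is which lines are indented and which are not.

```python
def greet(name):
    # Both indented lines belong to the body of greet().
    message = "hello, {0}".format(name)
    return message

# This line is not indented, so it is outside the function.
print(greet("Joe"))
```

Removing the indentation from the "return" line would move it out of the function, and Python would reject the file with a syntax error.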
Conditions in Python
Conditions are written without any parentheses or braces, and of course the block inside a condition is marked by indentation:
name = raw_input("What is your name? ")
greeting = ""
if name == "Huan Sebastyan":
    greeting = "Buenos dias"
else:
    greeting = "Hello"
print("{0}, {1} !!!".format(greeting, name))
execution:
demien$ python test.py
What is your name? Joe
Hello, Joe !!!
demien$ python test.py
What is your name? Huan Sebastyan
Buenos dias, Huan Sebastyan !!!
Loops
There are two types of loops in Python: while and for.
While loop:
value = ""
while value != "end":
    value = raw_input("Enter 'end' to quit or anything else to continue: ")
    print("you entered:{0}".format(value))
print("you did it!")
execution:
demien$ python test.py
Enter 'end' to quit or anything else to continue: hello
you entered:hello
Enter 'end' to quit or anything else to continue: world
you entered:world
Enter 'end' to quit or anything else to continue: end
you entered:end
you did it!
For loop:
count = int(raw_input("enter iteration count: "))
for i in range(1, count+1):
    s = ""
    for j in range(1, i+1):
        s += "*"
    print(s)
execution:
demien$ python test.py
enter iteration count: 5
*
**
***
****
*****
Functions
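A function definition begins with the keyword "def"; parentheses are used for the parameters. A variable assigned inside a function is local unless it is declared with "global":
x = 10
y = 20
def test(param):
    global x    # x refers to the global variable
    y = 20      # y is a new local variable that hides the global y
    x = x + 1
    y = y + 1
    print("from function context: param={0}, x={1}, y={2}".format(param, x, y))
    return "BYE!"
result = test("HELLO!")
print("from global context: result={0}, x={1}, y={2}".format(result, x, y))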
Execution:
demien$ python test.py
from function context: param=HELLO!, x=11, y=21
from global context: result=BYE!, x=11, y=20
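The example above changes x through "global". A common alternative is to pass the value in and return the new one, so the function touches no outside state. The sketch below assumes a made-up helper called increment and runs on both Python 2.7 and Python 3.

```python
def increment(value):
    # "value" is a local name; rebinding it does not affect the caller.
    value = value + 1
    return value

x = 10
y = increment(x)
print("x={0}, y={1}".format(x, y))
```

Here x keeps its original value and the result only becomes visible through the returned value stored in y.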
Modules
For big programs it's impractical to keep all the code in one file. Python provides the concept of a module for this.
Let's create a simple module in the subfolder "tools":
tools/simple.py:
def sayHi(name):
    print("Hi, {0}".format(name))
__version__ = "0.0.1"
To use it, besides the module itself we also need an empty file __init__.py in the same folder (the "tools" subfolder), so that Python treats the folder as a package:
demien$ ls -la tools/*.py
-rw-r--r-- 1 demien demien 0 mar 11 14:49 tools/__init__.py
-rw-r--r-- 1 demien demien 78 mar 11 14:46 tools/simple.py
Now we can use this module from our main program:
import tools.simple as simple
simple.sayHi("Joe")
Execution:
demien$ python test.py
Hi, Joe
Sometimes we need to know whether the code is being run by "direct" execution or imported as a module. For this we can check the condition: if __name__ == "__main__":
If it is true, the code is being executed in "direct" mode. Our current module produces no output when executed directly. Let's update it to produce some output, but only in the case of "direct" execution:
def sayHi(name):
    print("Hi, {0}".format(name))
__version__ = "0.0.1"
if __name__ == "__main__":
    sayHi("Huan Sebastyan")
So, now we can run our module in a direct way:
demien$ python tools/simple.py
Hi, Huan Sebastyan
And now let's use it as module:
demien$ python test.py
Hi, Joe
The output remained the same.
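To see why the check works, it can help to look at __name__ from both sides. This is a small sketch that uses the standard json module purely as a stand-in for any imported module.

```python
import json

# An imported module's __name__ is the name it was imported under.
imported_name = json.__name__
print(imported_name)

# A file started with "python somefile.py" sees "__main__" instead,
# which is what the check in tools/simple.py relies on.
if __name__ == "__main__":
    print("this file was executed directly")
```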
Dir function
The dir() function lists the names (variables and functions) defined in a module:
>>> import tools.simple
>>> dir(tools.simple)
['__builtins__', '__doc__', '__file__', '__name__', '__package__', '__version__', 'sayHi']
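The list returned by dir() is an ordinary list of strings, so it can be filtered like any other. The sketch below uses the standard math module as an example and keeps only the names that do not start with an underscore; the variable name public_names is just illustrative.

```python
import math

# Collect the names that do not start with an underscore.
public_names = []
for name in dir(math):
    if not name.startswith("_"):
        public_names.append(name)

print("sqrt" in public_names)
print("pi" in public_names)
```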
Collection classes
There are several collection classes in Python: lists, tuples, dictionaries and sets.
Lists
A list is a mutable structure for storing an ordered sequence of items.
colors = ["red", "black", "white"]
print "there are ",len(colors)," colors in my list:"
for color in colors:
    print(color)
colors.append("blue")
colors.append("green")
print "few more there added, now it's ",len(colors)," of them"
colors.sort()
print("sorted:", colors)
del colors[0]
del colors[0]
print("first 2 were deleted:", colors)
Execution:
demien$ python test.py
there are 3 colors in my list:
red
black
white
few more there added, now it's 5 of them
('sorted:', ['black', 'blue', 'green', 'red', 'white'])
('first 2 were deleted:', ['green', 'red', 'white'])
Tuples
Tuples are similar to lists, but they are immutable:
answer = ("yes", "no")
print answer[0]
print answer[1]
execution:
demien$ python test.py
yes
no
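What "immutable" means in practice can be shown by trying to change an element. The sketch below is only an illustration; it uses try/except, which is covered in the Exceptions section, and the flag name changed is made up.

```python
answer = ("yes", "no")

try:
    answer[0] = "maybe"   # tuples do not support item assignment
except TypeError as error:
    changed = False
    print("tuples cannot be modified: {0}".format(error))
else:
    changed = True

print(answer)
```

The same assignment on a list would succeed, which is the main practical difference between the two.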
Dictionary
A dictionary is a structure like a map or associative array: it stores key/value pairs.
user = {
    "name" : "Joe",
    "surname" : "Black",
    "address" : {
        "country" : "USA",
        "city" : "Houston"
    }
}
print user["name"]
print user["address"]
print user["address"]["city"]
print "full list of pairs[key,value] in user dictionary:"
for key, value in user.items():
    print "key=", key, " value=", value
Execution:
demien$ python test.py
Joe
{'country': 'USA', 'city': 'Houston'}
Houston
full list of pairs[key,value] in user dictionary:
key= surname value= Black
key= name value= Joe
key= address value= {'country': 'USA', 'city': 'Houston'}
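Dictionaries are mutable, like lists. A short sketch of the basic operations (adding, updating, testing for and deleting a key) could look like this; the data is invented for the example.

```python
user = {"name": "Joe"}

user["surname"] = "Black"        # add a new key
user["name"] = "Joseph"          # overwrite an existing key
has_address = "address" in user  # membership test looks at keys
del user["surname"]              # remove a key

remaining_keys = sorted(user.keys())
print(has_address)
print(remaining_keys)
```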
Sequence operations
Structures like lists, tuples and strings share a set of common "sequence" operations, such as indexing and slicing:
colors = ["red", "black", "white", "blue", "gray", "green", "orange"]
colors.sort()
print colors
print('color 2 is', colors[2])
print('color -2 is', colors[-2])
print('colors 1 to 3 is', colors[1:3])
print('colors 2 to end is', colors[2:])
print('colors 1 to -1 is', colors[1:-1])
print('colors start to end is', colors[:])
Execution:
demien$ python test.py
['black', 'blue', 'gray', 'green', 'orange', 'red', 'white']
('color 2 is', 'gray')
('color -2 is', 'red')
('colors 1 to 3 is', ['blue', 'gray'])
('colors 2 to end is', ['gray', 'green', 'orange', 'red', 'white'])
('colors 1 to -1 is', ['blue', 'gray', 'green', 'orange', 'red'])
('colors start to end is', ['black', 'blue', 'gray', 'green', 'orange', 'red', 'white'])
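The example above only uses a list. As a quick sketch, the same indexing and slicing can be tried on a string and a tuple; the values here are arbitrary.

```python
word = "python"
point = (10, 20, 30, 40)

third_letter = word[2]      # index from the start
second_last = word[-2]      # negative index counts from the end
middle = word[1:3]          # slice: positions 1 and 2
inner_point = point[1:-1]   # slicing a tuple gives a tuple

print(third_letter)
print(second_last)
print(middle)
print(inner_point)
print(len(word))
```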
Set
Another collection structure is the set. Sets support mathematical set operations: intersection (&, like AND), union (|, like OR) and symmetric difference (^, like XOR):
colors1 = set(["red", "black", "white"])
colors2 = set(["white", "blue", "gray"])
print(colors1)
print(colors2)
print("& : ", colors1 & colors2)
print("| : ", colors1 | colors2)
print("^ : ", colors1 ^ colors2)
Execution:
demien$ python test.py
set(['white', 'black', 'red'])
set(['blue', 'gray', 'white'])
('& : ', set(['white']))
('| : ', set(['blue', 'gray', 'black', 'white', 'red']))
('^ : ', set(['blue', 'gray', 'black', 'red']))
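Each of these operators also has a named method, which some people find easier to read. A minimal sketch with the same two sets, sorted into lists so the printed order is predictable:

```python
colors1 = set(["red", "black", "white"])
colors2 = set(["white", "blue", "gray"])

common = colors1.intersection(colors2)               # same as colors1 & colors2
combined = colors1.union(colors2)                    # same as colors1 | colors2
only_in_one = colors1.symmetric_difference(colors2)  # same as colors1 ^ colors2

print(sorted(common))
print(sorted(combined))
print(sorted(only_in_one))
```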
Classes
The main things to know about OOP in Python are:
- objects are created without a "new" keyword: userJoe = User("Joe", "Black")
- the constructor is named __init__
- variables defined directly in the class body are class (not object) variables and are accessed as ClassName.variableName
- "self" plays the role of "this" and is used to define object variables: self.name = name
- methods that use object variables must explicitly declare self as their first parameter: def sayHi(self)
- to define a subclass, the superclass name is passed in parentheses after the subclass name: class AdminUser(User)
- to call a superclass method (even the constructor), use the form SuperClassName.methodName and pass self explicitly: User.__init__(self, name, surname)
Example:
class User:
    userCount = 0
    def __init__(self, name, surname):
        self.name = name
        self.surname = surname
        User.userCount += 1
        print("User #{0} was created!".format(User.userCount))
    def sayHi(self):
        print("Hi, I'm {0} {1}".format(self.name, self.surname))
class AdminUser(User):
    def __init__(self, name, surname, role):
        User.__init__(self, name, surname)
        self.role = role
    def sayHi(self):
        User.sayHi(self)
        print(" and I'm the {0} !!!".format(self.role))
userJoe = User("Joe", "Black")
userJoe.sayHi()
userHuan = User("Huan", "Seastyan")
userHuan.sayHi()
userAdmin = AdminUser("Super", "Admin", "boss")
userAdmin.sayHi()
Execution:
demien$ python test.py
User #1 was created!
Hi, I'm Joe Black
User #2 was created!
Hi, I'm Huan Seastyan
User #3 was created!
Hi, I'm Super Admin
and I'm the boss !!!
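The difference between class variables and object variables is easy to miss, so here is a stripped-down sketch of just that point. The class name Ticket and its fields are invented for the illustration.

```python
class Ticket:
    issued = 0  # class variable: one value shared by all tickets

    def __init__(self, owner):
        self.owner = owner    # object variable: one value per ticket
        Ticket.issued += 1    # updated through the class name

first = Ticket("Joe")
second = Ticket("Huan")

print(Ticket.issued)
print(first.owner)
print(second.owner)
```

Each object keeps its own owner, while the counter lives on the class and reflects every object created so far.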
Files
To work with files, all we need is the built-in "file()" function (Python 2 only; "open()" takes the same arguments and is the preferred spelling), which takes two parameters:
1. file name
2. mode: read or write
Let's create a simple file in.txt with 2 lines of text:
it's a test file
just as example
Now let's create a simple program which will convert this file to uppercase and write it into "out.txt" file:
inFile = file("in.txt", "r")
outFile = file("out.txt", "w")
eof = False
while not eof:
    line = inFile.readline()
    if len(line) == 0:
        # readline() returns an empty string at end of file
        eof = True
    else:
        outFile.write(line.upper())
inFile.close()
outFile.close()
Execution result:
out.txt:
IT'S A TEST FILE
JUST AS EXAMPLE
Pickle
The pickle module (which has to be imported) lets us save an object into a file in "serialized" format. Later we can read the file and deserialize it back into the original object.
import pickle
class User:
    userCount = 0
    def __init__(self, name, surname):
        self.name = name
        self.surname = surname
        User.userCount += 1
        print("User #{0} was created!".format(User.userCount))
    def sayHi(self):
        print("Hi, I'm {0} {1}".format(self.name, self.surname))
userJoe = User("Joe", "Black")
fout = open("joe.bak", "wb")
pickle.dump(userJoe, fout)
fout.close()
del(userJoe)
fin = open("joe.bak", "rb")
restored = pickle.load(fin)
print(restored)
restored.sayHi()
fin.close()
Execution:
demien$ python test.py
User #1 was created!
<__main__.User instance at 0x7f05c0296ef0>
Hi, I'm Joe Black
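A file is not required for experimenting: pickle can also serialize to and from an in-memory byte string with dumps() and loads(). The sketch below uses a plain dictionary instead of the User class to keep it short.

```python
import pickle

user = {"name": "Joe", "surname": "Black"}

data = pickle.dumps(user)      # serialize to a byte string
restored = pickle.loads(data)  # build a new, equal object from it

print(restored == user)
print(restored is user)
```

The restored object is equal to the original but is a separate copy, which is the same thing that happens when the data goes through a file.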
Exceptions
Like other languages, Python has an error handling system built on TRY, EXCEPT (which stands for CATCH) and FINALLY. There is also an optional ELSE block that runs only when no exception was raised. In the next example we handle keyboard input exceptions, such as pressing Ctrl+C during input, and raise our own exception if the input is shorter than expected:
class ShortInputException(Exception):
    """A user-defined exception class."""
    def __init__(self, length, atleast):
        Exception.__init__(self)
        self.length = length
        self.atleast = atleast
try:
    text = raw_input("Enter something ....")
    if len(text) < 3:
        raise ShortInputException(len(text), 3)
except EOFError:
    print("Why did you do an EOF on me?")
except KeyboardInterrupt:
    print("You cancelled the operation.")
except ShortInputException as ex:
    print(("ShortInputException: The input was " +
           "{0} long, expected at least {1}")
          .format(ex.length, ex.atleast))
else:
    print("You entered {}".format(text))
finally:
    print("done")
Execution:
demien$ python test.py
Enter something ....12345
You entered 12345
done
demien$ python test.py
Enter something ....12
ShortInputException: The input was 2 long, expected at least 3
done
demien$ python test.py
Enter something ....^CYou cancelled the operation.
done
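The order in which the blocks run can be checked without keyboard input. The following sketch wraps the same try/except/else/finally shape in a made-up function, parse_age, that records which blocks were executed.

```python
def parse_age(text):
    """Hypothetical helper: convert text to int and record the blocks that ran."""
    events = []
    try:
        age = int(text)
    except ValueError:
        # Runs only when int() fails.
        events.append("invalid")
        age = None
    else:
        # Runs only when the try block raised nothing.
        events.append("ok")
    finally:
        # Runs in both cases.
        events.append("done")
    return age, events

print(parse_age("42"))
print(parse_age("abc"))
```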
Try with resource: WITH
If we open something in a TRY block, very often we have to close it in a FINALLY block ("try with resource" in other languages). To get this done automatically we can use the WITH statement: the opened resource is closed when the block ends, even if an exception is raised inside it:
with open("in.txt") as fin:
    for line in fin:
        print(line)
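As a sketch of how this combines with the earlier uppercase example, the conversion can be rewritten with WITH so that neither file needs an explicit close(). The function name upper_copy and the temporary folder are assumptions made for this illustration; the code uses open() and runs on Python 2.7 and Python 3.

```python
import os
import shutil
import tempfile

def upper_copy(src_path, dst_path):
    # Both files are closed automatically when the block ends.
    with open(src_path, "r") as src, open(dst_path, "w") as dst:
        for line in src:
            dst.write(line.upper())
    return src.closed and dst.closed

# Work in a temporary folder so no existing files are touched.
folder = tempfile.mkdtemp()
in_path = os.path.join(folder, "in.txt")
out_path = os.path.join(folder, "out.txt")

with open(in_path, "w") as f:
    f.write("it's a test file\njust as example\n")

both_closed = upper_copy(in_path, out_path)

with open(out_path) as f:
    result = f.read()

shutil.rmtree(folder)
print(both_closed)
print(result)
```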
The end
As for me, Python has a lot in common with JavaScript (dynamic typing, inheritance), but it is more focused on "back-end" development and scripting. It's very simple but powerful. A lot of big data frameworks now provide a Python API, so it's worth being familiar with this language.