Python Basic Operation

Basic Operation

  • To install a package, run pip install name in the terminal
  • User-defined functions
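def soton_fun1(x1, x2):
    return x1 * x2

# Variable-length positional arguments
def soton_fun1(*args):
    total = 0
    for i in args:
        total = total + i
    return total

# Keyword arguments: accepts any number of named arguments, which are collected into a dict.
def student(name, **kargs):
    print(name, kargs)
student("xiaoming")  # only the required argument is given
student("xiaoming", sex='male')  # the keyword argument sex is also given

# Anonymous functions
# map takes two arguments, a function and an iterable; it applies the function to every element and returns the results as an iterator.

a = [1, 2, 3, 4, 5, 6, 7, 8]
m1 = map(lambda s: s**2, a)

# m1 is an iterator; read its values with a for loop or list(), and only once
list(m1)
for value in m1:  # prints nothing here: list(m1) above already consumed m1
    print(value)

# reduce takes a function and a sequence, and combines the running result with the next element
from functools import reduce
x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
r1 = reduce(lambda a, b: a + b, x)
# adds up the elements of x

# filter takes a function and a sequence, applies the function to every element, and keeps the elements for which it returns True
a = [1, 2, 3, 4, 6, 7]
f1 = filter(lambda s: s > 3, a)
# keeps the values greater than 3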
  • Built-in functions
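# Find the position of a particular column (df is assumed to be an existing DataFrame, e.g. from pandas)
bdir = {}
for i in enumerate(df.columns):
    bdir[i[1]] = i[0]
bdir['column_name']  # returns the integer position of that column

# zip pairs up the elements of the two lists
a = [1, 2, 3, 4]
b = ['a', 'b', 'c', 'd']
list(zip(a, b))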
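
The notes on map, reduce and filter above can be checked with a minimal sketch. The variable names (nums, first_pass and so on) are made up for illustration. In Python 3, reduce has to be imported from functools.

```python
from functools import reduce

nums = [1, 2, 3, 4]

squares = map(lambda s: s ** 2, nums)      # an iterator, not a list
first_pass = list(squares)                 # consumes the iterator
second_pass = list(squares)                # nothing is left the second time

total = reduce(lambda a, b: a + b, nums)   # ((1 + 2) + 3) + 4
big = list(filter(lambda s: s > 2, nums))  # keep elements where the test is True

print(first_pass, second_pass, total, big)
```

The empty second_pass is the "only once" behaviour of iterators mentioned in the comments above.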

Tuple

Create Tuple

tup = 5, 4, 6
tup = (4, 5, 6), (7, 8)  # nested tuple
tuple([4, 6, 7])  # convert a sequence to a tuple
tuple('string')  # convert a string to a tuple of its characters

tup = tuple(['foo', [1, 2], True])
tup[1].append(3)
  • A tuple is immutable, and the objects stored in it can be of different types.
  • If an object stored in a tuple is mutable, that object can still be modified in place.
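
A minimal sketch of the two points above. The flag name reassigned is only for illustration.

```python
tup = tuple(['foo', [1, 2], True])

# Rebinding a slot of the tuple is not allowed and raises TypeError.
try:
    tup[2] = False
    reassigned = True
except TypeError:
    reassigned = False

# The list stored at index 1 is itself mutable, so it can change in place.
tup[1].append(3)

print(reassigned, tup)
```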

unpacking tuples

a, b = 1, 2
b, a = a, b
  • Tuple unpacking makes it easy to swap the values of two variables.

advanced tuple unpacking

values = 1, 2, 3, 4, 5
a, b, *rest = values
  • a and b take the values 1 and 2 (so a, b evaluates to the tuple (1, 2)), and rest is the list [3, 4, 5].
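
As a small sketch of the same idea, the starred name does not have to come last. The names first, middle and last are made up for this example.

```python
values = 1, 2, 3, 4, 5

a, b, *rest = values            # rest collects whatever is left over
first, *middle, last = values   # the starred name can also sit in the middle

a, b = b, a                     # swap through tuple unpacking

print(a, b, rest, first, middle, last)
```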

python format function

seq = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]
for a, b, c in seq:
    print("a={0}, b={1}, c={2}".format(a, b, c))

for a, b, c in seq:
    print("{0}, {1}, {2}".format(a, c, b))
  • See Book P53
  • seq is a list of tuples
  • The numbers in curly braces are the positions of the arguments passed to the format function.
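
A short sketch of how the positions work. The variable names are illustrative only.

```python
row = (1, 2, 3)

in_order = "{0}, {1}, {2}".format(*row)    # positions follow the argument order
reordered = "{2}, {0}, {1}".format(*row)   # positions can be rearranged
repeated = "{0}-{0}".format("ab")          # a position can be used more than once

print(in_order, "|", reordered, "|", repeated)
```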

List

Basic idea

  • Lists are mutable and can be defined with [ ] or the list function.

Adding and removing

a_list.append(xx)
# add the element xx at the end of the list
a_list.insert(1, xx)
# insert xx at index 1 (more expensive than append, because later elements have to be shifted)
a_list.pop(2)
# remove and return the element at index 2
a_list.remove("loo")
# remove the first "loo" element

'xx' in a_list
# check whether 'xx' is in a_list
'xx' not in a_list
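
The same operations on a concrete list, as a minimal sketch. The list contents are made up for illustration.

```python
a_list = ['foo', 'bar', 'loo', 'baz', 'loo']

a_list.append('end')      # ['foo', 'bar', 'loo', 'baz', 'loo', 'end']
a_list.insert(1, 'new')   # ['foo', 'new', 'bar', 'loo', 'baz', 'loo', 'end']
popped = a_list.pop(2)    # removes and returns 'bar'
a_list.remove('loo')      # removes only the first 'loo'

print(popped, a_list, 'loo' in a_list)
```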

Concatenating and combining lists

[1, 2, 3] + [4, 5, 6]
# concatenate the two lists into a new list
a_list.extend([4, 5, 6])
# append the elements of [4, 5, 6] to a_list in place

Sorting

a_list.sort()
# sort a_list in place

a_list.sort(key=len)
# sort by the length of each element

import bisect
bisect.bisect(a_list, 2)
# find the index where 2 should be inserted to keep a_list sorted

bisect.insort(a_list, 2)
# insert 2 into a_list while keeping it sorted
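
A minimal sketch of the bisect calls on sample data (the list is made up). The bisect functions do not check that the list is sorted, so they only give meaningful results on a list that already is.

```python
import bisect

a_list = [1, 2, 2, 4, 7]          # must already be sorted

pos_2 = bisect.bisect(a_list, 2)  # insertion point to the right of the existing 2s
pos_5 = bisect.bisect(a_list, 5)  # insertion point between 4 and 7
bisect.insort(a_list, 6)          # insert 6 at the position that keeps the order

print(pos_2, pos_5, a_list)
```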

Slicing

a_list[1:5]
# the elements at indices 1 through 4 (the start index is included, the stop index is not)
a_list[:5]
a_list[-6:-2]

a_list[2:3] = [6, 3]
# replace the element at index 2 with the two elements 6 and 3, so the list grows by one

a_list[::3]
# take every third element, starting from the first
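
The slices above, worked through on a made-up eight-element list as a quick sketch:

```python
a_list = [7, 2, 3, 7, 5, 6, 0, 1]

middle = a_list[1:5]      # indices 1, 2, 3, 4
head = a_list[:5]         # indices 0 through 4
from_end = a_list[-6:-2]  # same as a_list[2:6] for a list of length 8
stepped = a_list[::3]     # indices 0, 3, 6

b_list = list(a_list)     # copy, so a_list itself stays unchanged
b_list[2:3] = [6, 3]      # one element replaced by two

print(middle, head, from_end, stepped, b_list)
```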

Built-in sequence functions

Enumerate

list(enumerate(a_list))
# gives a list of (index, element) pairs
for i, element in enumerate(a_list):
    print(i, element)

Sorted function

  • It returns a new sorted list built from the elements of the given sequence.
c_list = sorted([7, 5, 5, 2])
d_list = sorted("adsfasdf afdsfds")

Zip function

  • It pairs up the elements of a number of lists, tuples or other sequences and yields tuples (in Python 3 the result is an iterator, so wrap it in list to see the pairs).
  • If the lengths of the sequences differ, the number of elements produced is determined by the shortest sequence.
  • Usually used in a for loop, possibly combined with enumerate.
  • Given a “zipped” sequence, zip can also be used to unzip it.
zipped = zip(a_list, b_list, c_list)
# the lengths of these lists can be different
list(zipped)

for i, (a, b) in enumerate(zip(a_list, b_list)):
    print('{0}: {1}, {2}'.format(i, a, b))

c_list = [('hah', 'xixi'), ('lala', 'meme')]
first_one, second_one = zip(*c_list)
# first_one is ('hah', 'lala') and second_one is ('xixi', 'meme')
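
A minimal sketch of zipping lists of unequal length and then unzipping the result. The sample values and the names pairs, firsts and seconds are made up.

```python
a_list = ['foo', 'bar', 'baz']
b_list = ['one', 'two']

# The shorter list decides how many pairs come out.
pairs = list(zip(a_list, b_list))

# Unzipping: the star passes each pair to zip as a separate argument.
firsts, seconds = zip(*pairs)

print(pairs, firsts, seconds)
```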

Reversed

  • reversed returns an iterator over the sequence in reverse order; wrap it in list to see the values
list(reversed(range(10)))

Dict

  • It is also called a hash map or associative array.
  • A dict is created with { } or the dict() function, and it is mutable.
  • It is a collection of key-value pairs.
  • Keys must be unique, but values need not be.
  • A value can be of any type.
  • If the same key appears twice when a dict is created, the later value is kept.

Create and add

d1 = {'a': 'some value', 'b': [1, 2, 3, 4]}

d1[7] = 'hahah'
# add the key-value pair 7: 'hahah'

Check and delete

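'b' in d1
# check whether 'b' is a key of d1

del d1[7]
# delete the pair with key 7 (deleting a key that does not exist raises KeyError)
d1.pop('a')
# remove the key 'a' and return its value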

Merge dicts

d1.update({'c': 'xixi', 5: 'lalal'})
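
A small sketch of how duplicate keys are resolved, both in a dict literal and in update. The values are made up for illustration.

```python
# A key repeated in a dict literal keeps only the later value.
d1 = {'a': 'first', 'a': 'second', 'b': [1, 2, 3, 4]}

# update adds new keys and overwrites the values of keys that already exist.
d1.update({'b': 'replaced', 'c': 'xixi'})

print(d1)
```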

Create dicts from sequences

mapping = {}
for key, value in zip(key_list, value_list):
    mapping[key] = value
# the loop version

mapping = dict(zip(range(5), reversed(range(5))))
# the same idea in one call: dict accepts a sequence of (key, value) pairs

Default value

if key in some_dict:
    value = some_dict[key]
else:
    value = default_value

value = some_dict.get(key, default_value)

# the two approaches do the same thing
words = ['apple', 'bat', 'bar', 'atom', 'book']
by_letter = {}
for word in words:
    letter = word[0]  # word is a str, so this takes its first letter
    if letter not in by_letter:
        by_letter[letter] = [word]
    else:
        by_letter[letter].append(word)


by_letter = {}  # start again from an empty dict
for word in words:
    letter = word[0]
    by_letter.setdefault(letter, []).append(word)


from collections import defaultdict
by_letter = defaultdict(list)
for word in words:
    by_letter[word[0]].append(word)
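
The sketch below checks that setdefault and defaultdict build the same groups, and shows one difference for missing keys. The names plain and auto are made up for this example.

```python
from collections import defaultdict

words = ['apple', 'bat', 'bar', 'atom', 'book']

plain = {}
for word in words:
    plain.setdefault(word[0], []).append(word)

auto = defaultdict(list)
for word in words:
    auto[word[0]].append(word)

same_groups = (plain == auto)

# get never inserts anything; indexing a defaultdict creates the missing key.
missing_plain = plain.get('z', [])
missing_auto = auto['z']

print(same_groups, 'z' in plain, 'z' in auto)
```

Because indexing a defaultdict inserts the default, get is the safer choice when a lookup should leave the dict unchanged.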

Valid key types

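  • Keys should be immutable, such as int, float, string, or tuple (provided every element of the tuple is immutable too).
  • A list is not acceptable, because it is mutable.
  • The technical term is hashability.
  • Use the hash function to check whether an object is hashable.
  • A list can be used as a key only if it is converted to a tuple first.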
hash('string')

hash([1, 2, 3])
hash((1, 2, [3, 4]))
# these two raise TypeError: the first is a list and the second contains a list
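
One way to run the hash check without stopping on the error is sketched below. The helper is_hashable is a hypothetical name introduced here, not a built-in.

```python
def is_hashable(obj):
    """Return True if hash(obj) works, False if it raises TypeError."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True

checks = [
    is_hashable('string'),          # str is immutable
    is_hashable([1, 2, 3]),         # a list is mutable
    is_hashable((1, 2, [3, 4])),    # a tuple that contains a list
    is_hashable(tuple([1, 2, 3])),  # the list converted to a tuple
]

d = {tuple([1, 2, 3]): 5}           # a converted list works as a key
print(checks, d)
```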

Set

List, Set, Dict and Nested list Comprehensions

  • The condition can be omitted
  • Basic form for List: [expr for val in collection if condition]
strings = ['a', 'as', 'bat']
[x.upper() for x in strings if len(x) >2]
# Get the strings with length > 2, and convert them to upper case.
  • Basic form for Set: {expr for val in collection if condition}
{len(x) for x in strings}
# Get the set of distinct lengths of the elements in strings

set(map(len, strings))
# the same result using map
  • Basic form for Dict: {key_expr: value_expr for val in collection if condition}
{val: index for index, val in enumerate(strings)}
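
The three forms side by side, as a minimal sketch on a slightly longer made-up list:

```python
strings = ['a', 'as', 'bat', 'car', 'dove', 'python']

upper_long = [x.upper() for x in strings if len(x) > 2]       # list comprehension
lengths = {len(x) for x in strings}                           # set comprehension
positions = {val: index for index, val in enumerate(strings)} # dict comprehension

print(upper_long, lengths, positions)
```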

Nested list comprehensions

  • Arrange according to the order of nesting
  • The order of the for expressions would be the same if you wrote a nested for loop.
# P1, method 1 (all_data is assumed to be a list of lists of names)
names_of_interest = []
for names in all_data:
    enough_es = [name for name in names if name.count('e') >= 2]
    names_of_interest.extend(enough_es)

# P1 method 2
result = [name for names in all_data for name in names if name.count('e') >= 2]
# here names is one of the two lists inside all_data, and name is an element of that list

# P2 method 1
flattened = []
for tup in some_tuples:
    for x in tup:
        flattened.append(x)

# P2 method 2
flattened = [x for tup in some_tuples for x in tup]
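
To see that the loop and the comprehension agree, here is a runnable sketch. The contents of all_data and some_tuples are hypothetical sample data, not taken from the notes.

```python
all_data = [['Steven', 'Anna', 'Eugene'], ['Pete', 'Lee', 'Maria']]

# Loop version: the outer for comes first, the inner for second.
names_of_interest = []
for names in all_data:
    for name in names:
        if name.count('e') >= 2:
            names_of_interest.append(name)

# Comprehension version: the for clauses appear in the same order.
result = [name for names in all_data for name in names if name.count('e') >= 2]

some_tuples = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]
flattened = [x for tup in some_tuples for x in tup]

print(result, flattened)
```

Note that count is case-sensitive, so the capital E in 'Eugene' is not counted.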

Functions

  • It is worth writing a reusable function if you repeat the same or similar code more than once.
  • A function is declared with the def keyword and returns a value with the return keyword.
  • Functions can have multiple return statements.
  • If a function finishes without reaching a return statement, it returns None.
  • Each function can have multiple positional arguments and keyword arguments.
  • Keyword arguments are most commonly used to specify default values or optional arguments.
  • Keyword arguments must follow the positional arguments.
def my_function(x, y, z=1.5):
    if z > 1:
        return z * (x + y)
    else:
        return z / (x + y)
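
A few ways of calling such a function, as a sketch. The function is repeated so that the block runs on its own, and no_return is a hypothetical helper that illustrates the None case.

```python
def my_function(x, y, z=1.5):
    if z > 1:
        return z * (x + y)
    else:
        return z / (x + y)

def no_return():
    pass  # no return statement anywhere in the body

default_z = my_function(10, 20)         # z falls back to 1.5
keyword_z = my_function(1, 1, z=0.5)    # z passed by keyword
all_named = my_function(x=1, y=1, z=2)  # every argument passed by keyword
nothing = no_return()

print(default_z, keyword_z, all_named, nothing)
```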

Return multiple values

def f():
    a = 5
    b = 6
    c = 7
    return a, b, c

a, b, c = f()
# Here, a, b, c are int

return_value = f()
# Here, return_value is a tuple

Functions are objects

  • Functions are objects, so they can be stored in lists and passed to other functions.
  • The example cleans a list of strings into a uniform format.
# Method 1
import re
def clean_strings(strings):
    result = []
    for value in strings:
        value = value.strip()
        value = re.sub('[!#?]', '', value)
        value = value.title()
        result.append(value)
    return result

# Method 2
def remove_punctuation(value):
    return re.sub('[!#?]', '', value)

clean_ops = [str.strip, remove_punctuation, str.title]
# This is the list of operations

def clean_strings(strings, ops):
    result = []
    for value in strings:
        for function in ops:
            value = function(value)
        result.append(value)
    return result

# Method 3
for x in map(remove_punctuation, strings):
    print(x)
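
A self-contained sketch of the list-of-operations approach, run on made-up input. The list states is hypothetical sample data.

```python
import re

def remove_punctuation(value):
    return re.sub('[!#?]', '', value)

# Functions are stored in a list like any other object.
clean_ops = [str.strip, remove_punctuation, str.title]

def clean_strings(strings, ops):
    result = []
    for value in strings:
        for function in ops:   # apply each operation in turn
            value = function(value)
        result.append(value)
    return result

states = ['  alabama ', 'Georgia!', 'FlOrIda', 'west virginia?']
cleaned = clean_strings(states, clean_ops)
print(cleaned)
```

Changing the cleaning steps then only means editing clean_ops, not the body of clean_strings.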

Anonymous functions (Lambda)

  • It is a way to write a function that consists of a single expression, whose value is the return value.
# Method 1
def short_function(x):
    return x * 2
# Method 2
equiv_anon = lambda x: x * 2

# Another Example
strings = ['foo', 'card', 'bar', 'aaaa', 'abab']
strings.sort(key=lambda x: len(set(list(x))))
# The list call is redundant here: set(x) already iterates over the characters of the string, so len(set(x)) gives the same key.
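
A quick sketch of what the sort key does: it orders the strings by their number of distinct letters. The names keys and same_without_list are made up for this example.

```python
strings = ['foo', 'card', 'bar', 'aaaa', 'abab']

# Number of distinct letters in each string.
keys = [len(set(x)) for x in strings]

# set(list(x)) and set(x) build the same set of characters.
same_without_list = all(set(list(x)) == set(x) for x in strings)

# list.sort is stable, so 'foo' stays ahead of 'abab' (both have 2 distinct letters).
strings.sort(key=lambda x: len(set(x)))

print(keys, same_without_list, strings)
```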

Currying: Partial Argument Application

  • Deriving new functions from existing ones by partial argument application.
def add_numbers(x, y):
    return x + y

add_five = lambda y: add_numbers(5, y)

from functools import partial
add_five = partial(add_numbers, 5)
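
A minimal sketch of partial that also shows one pitfall. If the first parameter is bound by keyword, a later positional call tries to fill the same parameter again and fails. The names by_keyword and positional_call_failed are made up for this example.

```python
from functools import partial

def add_numbers(x, y):
    return x + y

# Bind the first positional argument.
add_five = partial(add_numbers, 5)
eight = add_five(3)             # add_numbers(5, 3)

# Binding x by keyword works only if y is then passed by keyword too.
by_keyword = partial(add_numbers, x=5)
also_eight = by_keyword(y=3)    # add_numbers(x=5, y=3)

try:
    by_keyword(3)               # 3 would also be assigned to x
    positional_call_failed = False
except TypeError:
    positional_call_failed = True

print(eight, also_eight, positional_call_failed)
```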