Sources: https://media.readthedocs.org/pdf/dryscrape/latest/dryscrape.pdf, https://learnpythonthehardway.org/, http://www.python-course.eu/lambda.php
Last updated : 04-June-2022
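#Getting help is easy in the Python shell using the help() function.
help('import')
import requests
help(requests.utils) #a module must be imported before help() can inspect it.
from bs4 import BeautifulSoup
help(BeautifulSoup)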
requests module: Requests is an HTTP library, written in Python, for human beings. Install it using pip.
import requests as req
ses=req.Session()
url="http://www.google.co.in"
resp=ses.get(url)
resp.status_code #returns status code for request
resp.text #returns response body
resp.cookies.items()
resp.headers.items()
cookies1=resp.cookies #reference copied
cookies1=resp.cookies.copy() #value copied
cookies1.pop('key') # removes the cookie named 'key'; raises KeyError if no such cookie exists.
cookieDictionary = req.utils.dict_from_cookiejar(resp.cookies) # extract all cookies (in response) into a dictionary. The module was imported as req, so the name requests is not defined here.
cookieJarTmp=None
cookieJar=req.utils.add_dict_to_cookiejar(cookieJarTmp, cookieDictionary) # creates a cookiejar to send (in request) from the dictionary.
Unpacking of values:
>>> fruits = ["apple", "banana", "custard"]
>>> a,b,c=fruits #the number of variables on the left must equal the number of elements on the right.
>>> print(a)
apple
>>>
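The unpacking rule above can be sketched as follows, assuming Python 3. The starred form and the variable names are illustrative additions, not part of the original notes.

```python
fruits = ["apple", "banana", "custard"]

# Plain unpacking: the number of names must match the number of elements.
a, b, c = fruits

# A mismatch raises ValueError instead of silently dropping values.
try:
    x, y = fruits
except ValueError as err:
    mismatch_message = str(err)

# Python 3 only: a starred name collects the remaining elements as a list.
first, *rest = fruits
```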
Command line Arguments: test_argv.py
>>> import sys
>>> sys.argv
['']
>>>
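A minimal sketch of what test_argv.py might contain, assuming Python 3. The helper name describe_args is hypothetical; only sys.argv comes from the notes above.

```python
import sys


def describe_args(argv):
    """Split an argv-style list into the script name and its arguments."""
    script_name = argv[0] if argv else ""
    arguments = argv[1:]
    return script_name, arguments


if __name__ == "__main__":
    # sys.argv[0] is the script name; the rest are the arguments as strings.
    name, args = describe_args(sys.argv)
    print("script:", name)
    print("arguments:", args)
```

In the interactive shell sys.argv is [''] because no script was given; when the file is run as a script, the first element is the script path and every argument arrives as a string.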
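Read input from the keyboard: input() vs raw_input() (Python 2 behaviour; in Python 3, raw_input() is gone and input() always returns the typed text as a string)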
>>> a=input("Your fav fruit: ")
"apple"
>>> a
'apple'
>>> a=input("Your age: ")
23
>>> a
23
>>> b=raw_input("Your name: ")
Rohit
>>> b
'Rohit'
>>> a=input("Your name again: ")
Rohit
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<string>", line 1, in <module>
NameError: name 'Rohit' is not defined
>>> a=input("Your name again: ")
b
>>> # the typed b was evaluated as the variable b, so b's value ('Rohit') was assigned to a.
>>> a
'Rohit'
>>>
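For Python 3, where input() always returns a string, a hedged sketch of getting typed values back without evaluating arbitrary code. The function names parse_age and parse_literal are hypothetical; ast.literal_eval accepts only Python literals, so a bare word such as Rohit is rejected rather than looked up as a variable.

```python
import ast


def parse_age(text):
    """Convert typed text such as ' 23 ' into an int."""
    return int(text.strip())


def parse_literal(text):
    """Safely evaluate a typed Python literal (number, quoted string, list, ...)."""
    return ast.literal_eval(text)


# Typical use in a script (commented out so the sketch runs without a keyboard):
# age = parse_age(input("Your age: "))
```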
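Parameter passing (arguments are always passed as object references; mutating the object is visible to the caller, rebinding the parameter name is not):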
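A small sketch of how passing by object reference behaves, assuming Python 3. The function names append_item and rebind are made up for illustration.

```python
def append_item(items):
    # Mutates the object the caller passed in; the caller sees the change.
    items.append("added")


def rebind(items):
    # Rebinds only the local name; the caller's list is untouched.
    items = ["replaced"]
    return items


names = ["Rohit"]
append_item(names)      # names now has two elements
result = rebind(names)  # names is unchanged by this call
```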
Get a handle on a file > write to the file > read from the file:
>>> content="This will be written to then read from same file."
>>> f=open('/tmp/test.txt', 'wb') # open file for binary write, so the encoded bytes can be written on both Python 2 and 3.
>>> f.write(content.encode('utf-8')) # Python 3 also echoes the number of bytes written.
>>> f.close()
>>> f=open('/tmp/test.txt', 'r') # open file for read.
>>> line = f.read()
>>> line
'This will be written to then read from same file.'
>>> f.close()
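The same write-then-read round trip can be sketched with a with block, which closes the file automatically. This assumes Python 3, where open() accepts an encoding argument; the temporary directory and the sample text are illustrative choices.

```python
import os
import tempfile

content = "Café, naïve, ₹100"  # contains non-ASCII characters on purpose
path = os.path.join(tempfile.mkdtemp(), "test.txt")

# Text mode with an explicit encoding: no manual encode() call is needed.
with open(path, "w", encoding="utf-8") as handle:
    handle.write(content)

with open(path, "r", encoding="utf-8") as handle:
    read_back = handle.read()

# The same text cannot be encoded as plain ASCII.
try:
    content.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True
```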
Conditional statements:
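A minimal sketch of if / elif / else and the one-line conditional expression, assuming Python 3. The function name age_group and its labels are hypothetical.

```python
def age_group(age):
    """Classify an age using if / elif / else."""
    if age < 18:
        return "minor"
    elif age <= 60:
        return "working age"
    else:
        return "senior"


# Conditional expression: value_if_true if condition else value_if_false.
label = "even" if 4 % 2 == 0 else "odd"
```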
Loop statements:
>>> for i in range(5): #range(5) generates the 5 numbers 0 to 4
...     i
...
0
1
2
3
4
>>> for i in range(2,5): #numbers from 2 up to, but not including, 5
...     i
...
2
3
4
>>> for i in range(2,5,2): #every second number from 2 up to, but not including, 5
...     i
...
2
4
>>>
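A short sketch of related loop forms, assuming Python 3, where range() is lazy and needs list() to be shown as a list. enumerate and while are additions not covered in the transcript above.

```python
# range(start, stop, step) excludes the stop value.
evens = list(range(2, 5, 2))

# enumerate yields (position, element) pairs; start=1 makes positions 1-based.
names = ["Rohit", "Rahul"]
numbered = []
for position, name in enumerate(names, start=1):
    numbered.append("{}. {}".format(position, name))

# A while loop repeats until its condition becomes false.
countdown = []
n = 3
while n > 0:
    countdown.append(n)
    n -= 1
```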
Unique values (set datatype, or dictionary keys):
set(<ARRAY>)
or,
use a dictionary with the ARRAY elements as keys and a constant 0 as each value, then take <dictionaryOBJ>.keys().
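Both approaches can be sketched as below, assuming Python 3.7 or later, where dictionaries keep insertion order. The sample list is made up.

```python
values = ["b", "a", "b", "c", "a"]

# A set removes duplicates but does not promise any particular order.
unique_unordered = set(values)

# dict.fromkeys builds a dictionary with each element as a key and 0 as the
# value; its keys are the unique elements in first-seen order.
unique_in_order = list(dict.fromkeys(values, 0).keys())
```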
Lambda, filter, reduce and map
a=[1,2,3,4]
def sqr(x): return x**2
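map(sqr, a) #Returns the square of each number in list a (a list in Python 2; in Python 3 map returns an iterator, so wrap it in list()).
Use map to convert a list of numeric strings into a list of ints.
a=['1', '2', '3', '4']
b=list(map(int, a)) #converts each string element in the list into an integer and stores the resulting list in b.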
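The heading also names lambda, filter and reduce; a minimal sketch of all four together, assuming Python 3, where reduce lives in functools.

```python
from functools import reduce  # reduce is not a builtin in Python 3

numbers = [1, 2, 3, 4]

# lambda defines a small anonymous function inline.
squares = list(map(lambda x: x ** 2, numbers))

# filter keeps only the elements for which the function returns True.
evens = list(filter(lambda x: x % 2 == 0, numbers))

# reduce folds the list into a single value, here a running sum.
total = reduce(lambda acc, x: acc + x, numbers)
```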
Use BeautifulSoup for parsing web documents. Install bs4 and html5lib using pip.
from bs4 import BeautifulSoup
soup=BeautifulSoup("content to parse", "html.parser")
element=soup.find(id='<id of element>') #eg- table's id, will return the entire table tag (None if no element has that id).
childElements=element.select('<Child Element Tag>') #eg- tr, will return a list of tr tags within the table.
childElements[0].get_text(strip=True).replace('\n', '').encode('ascii', 'ignore') #select() returns a list, so index it first; returns the text of the first child with non-ASCII characters dropped.
soup.find_all('<ELEMENT_NAME>', <ATTRIBUTE_NAME>=<ATTRIBUTE_VALUE>) #returns a list of elements matching the given criteria.
trs = soup.find_all('tr') #Get all tr tags (findAll is the older alias of find_all)
trCount = len(trs) #Get count of tr tags
trs[-2].text #Get the second-last tr; same as trs[trCount-2].
Tuple (), List [] and Dictionary {}
Tuple is an immutable collection.
tupleX=("Rohit", "Rahul")
tupleX[1]="test reassignment" ##Change of value is not allowed; raises TypeError.
del tupleX[0] ##Change in structure is not allowed either; raises TypeError.
tupleX=("abc") ##Parentheses alone do not make a tuple: this is the string "abc", so indexing returns a character.
tupleX[0]
'a'
tupleX=("abc",) ##A one-element tuple needs a trailing comma.
tupleX[0]
'abc'
List is mutable.
listX=["Rohit","Rahul"]
listX[0]="ROHIT" ##Change in value is allowed.
del listX[0] ##Change in structure is allowed as well.
Dictionary is a key-value collection.
dictX={"1":"Rohit", "2":"Rahul"}
dictX["1"]="ROHIT" ##updates value attached to key "1" to "ROHIT"
del dictX["1"] ##Removes element with key "1"
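A runnable sketch of the three collections, assuming Python 3. It captures the tuple error instead of letting it stop the script; dict.get and dict.pop are additions not shown above.

```python
tuple_x = ("Rohit", "Rahul")
try:
    tuple_x[1] = "test reassignment"  # tuples reject item assignment
except TypeError as err:
    tuple_error = str(err)

not_a_tuple = ("abc")  # just a string in parentheses
one_tuple = ("abc",)   # the trailing comma makes it a tuple

list_x = ["Rohit", "Rahul"]
list_x[0] = "ROHIT"    # lists allow item assignment
del list_x[0]          # and structural changes

dict_x = {"1": "Rohit", "2": "Rahul"}
missing = dict_x.get("3", "unknown")  # get() avoids KeyError for absent keys
removed = dict_x.pop("1")             # pop() removes the key and returns its value
```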
Working with DateTime :
import datetime
print(datetime.date.today())
print(datetime.datetime.now()) #now() returns a datetime object initialized with the local time zone's date and time (naive: no tzinfo is attached)
now = datetime.datetime.now()
print(now.year) #prints current local tz's year
print(datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S"))
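A small sketch of formatting, parsing and date arithmetic, assuming Python 3. A fixed timestamp is used instead of now() so the values are reproducible; the date itself is arbitrary.

```python
from datetime import datetime, timedelta

stamp = datetime(2022, 6, 4, 9, 30, 0)  # fixed example value

# strftime turns a datetime into text; strptime parses text back.
text = stamp.strftime("%Y-%m-%d %H:%M:%S")
parsed = datetime.strptime(text, "%Y-%m-%d %H:%M:%S")

# timedelta represents a duration that can be added to a datetime.
next_week = stamp + timedelta(days=7)
```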
#Debug your py script using Visual Studio Code.
1. Open the py file to debug.
2. Hit F5. Click 'more' in the debugger selection drop-down that appears at the top.
4. Install the Python extension by 'Don Jayamanne'. Click 'Reload' after installation to restart Visual Studio Code.
5. Hit F5 to begin debugging. Add breakpoints wherever needed by clicking beside the line numbers shown on the left.
#Error: requests module not found.
- Close Visual Studio Code > in a terminal, activate the Python environment where the requests module was installed > start code from within that environment so that all modules installed there are available.
Problem-1: pip install mysql-python on Windows resulted in the error "INCLUDE not found in environment".
Soln: easy_install mysql-python #Use easy_install instead of pip.
Problem-2: While writing to a file, ASCII encoding fails if a special (non-ASCII) character is to be written.
Soln: Encode the content to be written as utf-8.
Problem-3: pylint - 1,0,convention,C0111:Missing module docstring.
Soln: Add a short description of the code at the top of the module within """ """ (a pair of triple quotes).
Problem-4:
def work(age):
    """Function work(age) demonstrates exception handling and ways of documenting code in Python.
    It takes age as an argument and raises an exception with a custom message if it is not in the range 18 to 60.
    If age is zero, a ZeroDivisionError is raised.
    """
    x = -1
    try:
        x = 100 / age #raises ZeroDivisionError if input age is 0
        if age < 18 or age > 60:
            raise Exception("Age is beyond range 18-60.") #raise Exception if input age is not in range 18-60
    except ZeroDivisionError:
        print("Threw divide by zero and was handled.")
    except Exception as e:
        print("Exception {} was thrown and handled here.".format(e))
    else:
        print("Age - {}, X - {}".format(age, x)) #if no exception is thrown, this line executes.
    finally:
        print("Success!") #This executes in both cases (exception or not)
#Tests
work(0)
work(17)
work("20") #100 / "20" raises TypeError, which the generic except Exception branch handles.
work(20)
#Test documentation
help(work)
print(work.__doc__)
work.__doc__ = "Changed __doc__ property " # instead of a """ """ docstring, the text can be assigned to __doc__ directly through a reference to the function.
print(work.__doc__)
help(work)
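The example above signals the out-of-range case with a plain Exception. A hedged variant with a dedicated exception class is sketched below, assuming Python 3; the names AgeOutOfRangeError, checked_ratio and describe are hypothetical and not part of the original notes.

```python
class AgeOutOfRangeError(Exception):
    """Hypothetical custom exception for ages outside 18-60."""


def checked_ratio(age):
    """Return 100 / age, raising AgeOutOfRangeError for ages outside 18-60."""
    if not 18 <= age <= 60:
        raise AgeOutOfRangeError("Age {} is beyond range 18-60.".format(age))
    return 100 / age


def describe(age):
    """Report the outcome as text instead of printing inside the handler."""
    try:
        ratio = checked_ratio(age)
    except AgeOutOfRangeError as err:
        return "rejected: {}".format(err)
    else:
        return "ok: {}".format(ratio)
```

A dedicated class lets callers catch only this failure and leave unrelated errors, such as a TypeError from a string argument, to propagate.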
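Upload a file to S3 using boto (the legacy boto 2 package):
from boto.s3.connection import S3Connection
from boto.s3.key import Key
con=S3Connection('<ACCESS_KEY>','<SECRET_KEY>') #connect using the account's access key and secret key.
bucket=con.get_bucket('<BUCKETNAME>') #get a handle on an existing bucket.
key=Key(bucket)
key.key='<DEST_FILE_NAME>' #name the object will have inside the bucket.
key.set_contents_from_filename('<SOURCE_FILE_PATH>') #upload the local file as that object.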
os package:
import os
os.linesep #returns the default line separator used on the current machine.
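A brief sketch around os.linesep, assuming Python 3. os.path.join and os.sep are additions for illustration. Note that files opened in text mode translate "\n" to the platform separator on write, so "\n" is the terminator to use there rather than os.linesep.

```python
import os

# os.linesep is "\n" on Unix-like systems and "\r\n" on Windows.
is_known_separator = os.linesep in ("\n", "\r\n")

# For text-mode files, join lines with "\n"; Python translates it on write.
lines = ["first", "second"]
text = "\n".join(lines) + "\n"

# os.path.join builds paths with the separator of the current platform.
path = os.path.join("tmp", "test.txt")
```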