Python is a versatile and powerful programming language widely used in data science, machine learning, and scientific computing. This lab manual covers fundamental concepts and hands-on exercises to help you develop proficiency in Python for data analysis.
To start Python:
On Windows, start Python by typing the following command in a command window: python
This opens an interactive session and shows a prompt made of three greater-than signs, >>>.
The >>> prompt means that Python code and commands can be typed and executed directly in the interactive session:
C:\> python
>>> 3+3
6
To exit Python:
Windows: Press CONTROL-Z + Enter, or
Use the command exit() in the interactive session
Another way of running Python is to write the code in a text file and save it with the .py extension, e.g. Salam.py.
# Salam.py
print("Aslam O Alaikum, Happy learning Python programming language!")
Execute the file using the command:
python filepath/Salam.py
It should produce the following output:
Aslam O Alaikum, Happy learning Python programming language!
Python is dynamically typed: there is no need to declare the type or size of a variable before storing data in it. The type is determined by the value assigned.
x = 34 - 23 # Integer
y = "Aslam" # String
z = 3.45 # Float
# Multiple variable assignments on a single line are possible
x, y = 2, 3
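The assignments above can be checked with the built-in type() function. The following is a small sketch; the variable name value is chosen here only for illustration and is not part of the original examples.

```python
# The same name can be rebound to values of different types.
# type() reports the type of the value currently held.
value = 34 - 23
print(type(value).__name__)   # int

value = "Aslam"
print(type(value).__name__)   # str

value = 3.45
print(type(value).__name__)   # float

# Multiple assignment also allows swapping two variables
# without a temporary variable.
x, y = 2, 3
x, y = y, x
print(x, y)   # 3 2
```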
a = 5 + 3 # Addition
b = 7 - 2 # Subtraction
c = 3 * 4 # Multiplication
d = 8 / 2 # Division
e = 10 % 3 # Modulus
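A short sketch of what these operators return. Note that / always produces a float in Python 3. The operators // (floor division) and ** (exponentiation) do not appear in the list above and are included here only for comparison.

```python
a = 5 + 3     # 8
d = 8 / 2     # 4.0, true division always returns a float
e = 10 % 3    # 1, the remainder of 10 divided by 3
q = 10 // 3   # 3, floor division (not in the list above)
p = 2 ** 5    # 32, exponentiation (not in the list above)

print(a, d, e, q, p)   # 8 4.0 1 3 32
```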
x = True
y = False
print(x and y) # Prints False
print(x or y) # True
print(not x) # False
Python uses indentation, that is, whitespace at the start of a line, to define code blocks. In the following code, the print statement is inside the if block, so it executes only if the condition is true, that is, if the value of x is greater than 10:
if x > 10:
    print("X is greater than 10")
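To make the role of indentation more visible, here is a minimal sketch with nested blocks. The function name describe is a hypothetical helper introduced only for this illustration.

```python
def describe(x):
    """Return a message depending on whether x exceeds 10."""
    if x > 10:
        # Both indented lines below belong to the if block.
        message = "X is greater than 10"
        return message
    # This line is back at the function's indentation level,
    # so it runs only when the condition above was false.
    return "X is 10 or less"


print(describe(15))   # X is greater than 10
print(describe(4))    # X is 10 or less
```

Four spaces per level is the common convention; what matters to Python is that all lines of one block use the same indentation.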
The hash sign # starts a comment. Anything on a line after # is ignored by the interpreter. Python has no dedicated multi-line comment syntax. However, text between triple single quotes or triple double quotes is a string literal; when it is not assigned or used, it has no effect on the program, so it is often used as a multi-line comment.
# This is a comment
"""This is a
multi-line comment"""
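A related use of triple-quoted strings, shown here as a sketch: when such a string is the first statement inside a function, Python keeps it as the function's documentation string (docstring). The function greet is a hypothetical example, not part of the original manual.

```python
def greet(name):
    """Return a greeting for name.

    Because this string is the first statement in the function,
    Python stores it as the function's docstring.
    """
    # A regular comment: ignored entirely by the interpreter.
    return "Hello, " + name


print(greet("Ali"))     # Hello, Ali
print(greet.__doc__)    # prints the docstring text
```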
x = 34 - 23  # A comment.
y = "Aslam"  # Another one.
z = 3.45
if z == 3.45 or y == "Aslam":
    x = x + 1
    y = y + " O Alaikum"  # String concat.
print(x)
print(y)
Unlike many conventional programming languages, Python has no built-in fixed-size array type in its core syntax. It provides lists and tuples instead. Lists are very flexible: they can hold values of any type, their size does not need to be declared, and they grow automatically to accommodate new data. Creating and manipulating lists:
my_list = [1, 2, 3, 4, 5]
my_list.append(6) # Add element
my_list.remove(2) # Remove element
print(my_list[1:3]) # Slicing
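The list operations above can be traced step by step. The sketch below repeats them with the intermediate contents written as comments; the list named mixed is added here only to illustrate that a list can hold values of different types.

```python
my_list = [1, 2, 3, 4, 5]
my_list.append(6)       # list is now [1, 2, 3, 4, 5, 6]
my_list.remove(2)       # removes the first element equal to 2: [1, 3, 4, 5, 6]

print(my_list[1:3])     # [3, 4]  (indices 1 and 2; index 3 is excluded)
print(my_list[-1])      # 6       (negative indices count from the end)
print(len(my_list))     # 5

# A list can mix types and grows as elements are appended.
mixed = [1, "two", 3.0, [4, 5]]
mixed.append(None)
print(len(mixed))       # 5
```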
Tuples are like lists, but once defined they cannot be changed, which is why they are called immutable sequences:
my_tuple = (1, 2, 3)
print(my_tuple[0])
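Immutability can be observed directly: assigning to a tuple element raises a TypeError. The following sketch catches that error and then builds a new tuple instead; the names failed and longer are illustrative.

```python
my_tuple = (1, 2, 3)

failed = False
try:
    my_tuple[0] = 99          # not allowed: tuples are immutable
except TypeError as error:
    failed = True
    print("Cannot modify a tuple:", error)

# To "change" a tuple, build a new one from the old one.
longer = my_tuple + (4,)      # note the trailing comma for a one-element tuple
print(longer)                 # (1, 2, 3, 4)
print(my_tuple)               # (1, 2, 3), the original is unchanged
```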
Dictionaries store key-value pairs:
my_dict = {"name": "Ali", "age": 25}  # Define a dictionary with two entries
print(my_dict["name"])  # Look up a value by its key
my_dict["city"] = "Karachi"  # Add a new entry
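A few further dictionary operations, as a hedged sketch: get() with a default value, membership testing with in, iteration with items(), and deletion with del. The key "country" is a hypothetical key used only to show a missing entry.

```python
my_dict = {"name": "Ali", "age": 25}
my_dict["city"] = "Karachi"

# get() returns a default instead of raising KeyError for a missing key.
country = my_dict.get("country", "unknown")
print(country)             # unknown

# The in operator tests whether a key is present.
print("age" in my_dict)    # True

# items() yields (key, value) pairs in insertion order.
for key, value in my_dict.items():
    print(key, "->", value)

# del removes an entry by key.
del my_dict["age"]
print(list(my_dict))       # ['name', 'city']
```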
Sets are unordered collections of unique elements:
my_set = {1, 2, 3, 4}
my_set.add(5)
print(my_set)
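The uniqueness property means that adding an element that is already present changes nothing. The sketch below shows this, along with intersection and union; the set named other is introduced only for illustration, and sorted() is used because sets have no guaranteed display order.

```python
my_set = {1, 2, 3, 4}
my_set.add(5)
my_set.add(3)                       # 3 is already present, so nothing changes
print(len(my_set))                  # 5

# Converting a list to a set removes duplicates.
unique = set([1, 1, 2, 2, 3])
print(sorted(unique))               # [1, 2, 3]

other = {4, 5, 6}
common = my_set & other             # intersection: elements in both sets
combined = my_set | other           # union: elements in either set
print(sorted(common))               # [4, 5]
print(sorted(combined))             # [1, 2, 3, 4, 5, 6]
```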
The for loop assigns each item of a sequence, such as a list or a range, to a loop variable in turn.
for variable_name in sequence:
    pass  # do something here
A counting loop uses the built-in function range(), which takes up to three parameters:
range(start, stop, step)
where
start is the first number in the series (an integer, 0 if omitted),
stop is the end of the series and is itself not included,
step is the increment (1 if omitted).
Example:
for i in range(2, 9, 2):
    print(i)  # i iterates through 2, 4, 6, 8 here
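The effect of each range() parameter is easiest to see by converting the range to a list. In this sketch the variable names are illustrative; note in every case that the stop value itself is never produced.

```python
one_arg = list(range(5))            # only stop: start defaults to 0, step to 1
two_args = list(range(2, 9))        # start and stop
three_args = list(range(2, 9, 2))   # start, stop and step
countdown = list(range(10, 0, -3))  # a negative step counts downwards

print(one_arg)      # [0, 1, 2, 3, 4]
print(two_args)     # [2, 3, 4, 5, 6, 7, 8]
print(three_args)   # [2, 4, 6, 8]
print(countdown)    # [10, 7, 4, 1]
```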
Python also provides mechanisms for controlling the flow of a loop:
break: Exit the loop immediately.
continue: Skip to the next iteration of the loop.
else: Execute a block of code when the loop finishes without being stopped by break.
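The three keywords can be combined in one loop. The sketch below uses a hypothetical helper, first_even, that skips negative numbers with continue, stops at the first even number with break, and uses the loop's else clause to report that nothing was found.

```python
def first_even(numbers):
    """Return the first non-negative even number, or None if there is none."""
    found = None
    for n in numbers:
        if n < 0:
            continue          # skip negative numbers, go to the next item
        if n % 2 == 0:
            found = n
            break             # stop the loop as soon as a match is found
    else:
        # Runs only if the loop finished without hitting break.
        print("no even number found")
    return found


print(first_even([-4, 3, 8, 10]))   # 8
print(first_even([1, 3, 5]))        # prints the message, then None
```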
my_list = ["abacab", 575, 24, 5, 6]
for item in my_list:
    print("{0}".format(item))
values = (12, 450, 1, 89, 2017, 90125)
for value in values:
    print("{0} in binary is {1}".format(value, bin(value)))
    print("{0} in octal is {1}".format(value, oct(value)))
    print("{0} in hexadecimal is {1}".format(value, hex(value)))
Output (excerpt from the second loop, for the values 12 and 450):
12 in binary is 0b1100
12 in octal is 0o14
12 in hexadecimal is 0xc
450 in binary is 0b111000010
450 in octal is 0o702
450 in hexadecimal is 0x1c2
Users can define their own functions in Python. Let us demonstrate by writing a function that computes the factorial of a given integer.
Save the following code in a text file named fact.py:
# Example script (fact.py):
def fact(x):
    if x == 0:
        return 1
    return x * fact(x - 1)
print("N, fact(N)")
print("---------")
for n in range(4):
    print(n, fact(n))
def starts the function definition.
fact is the function name, which you are free to choose.
x is the input parameter.
return sends the result back to where the function was called.
fact(x) is a recursive function that returns the factorial of x.
fact(0) returns 1 (base case).
Otherwise, it returns x * fact(x - 1).
The for loop prints the factorials of the numbers from 0 to 3, because range(4) stops before 4.
Run the Python script in a command window using the command:
python fact.py
The expected output is:
N, fact(N)
---------
0 1
1 1
2 2
3 6
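For comparison, the same result can be computed with a loop instead of recursion. The function fact_iterative below is an illustrative alternative, not part of fact.py. It also rejects negative input, for which the recursive version would never reach its base case, and checks its results against math.factorial from the standard library.

```python
import math


def fact_iterative(n):
    """Compute n! with a loop instead of recursion (illustrative alternative)."""
    if n < 0:
        raise ValueError("n must be non-negative")
    result = 1
    for k in range(2, n + 1):   # an empty range for n = 0 or 1 leaves result at 1
        result *= k
    return result


for n in range(4):
    # Both columns should agree: 1, 1, 2, 6
    print(n, fact_iterative(n), math.factorial(n))
```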
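Reading from and writing to text files using the with statement, which closes the file automatically:
with open("data.txt", "r") as file:
    content = file.read()
with open("output.txt", "w") as file:
    file.write("Hello, world!")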
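The two with blocks can be combined into a round trip: write a file, then read it back. This sketch uses a temporary directory from the standard tempfile module so that it does not depend on data.txt existing; the second line of text and the explicit encoding argument are assumptions added for the example.

```python
import os
import tempfile

# Work in a temporary directory so no existing files are touched.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "output.txt")

    # "w" creates the file, or overwrites it if it already exists.
    with open(path, "w", encoding="utf-8") as file:
        file.write("Hello, world!\n")
        file.write("Second line\n")

    # Iterating over a file object yields one line at a time.
    with open(path, "r", encoding="utf-8") as file:
        lines = [line.rstrip("\n") for line in file]

print(lines)   # ['Hello, world!', 'Second line']
```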
Importing NumPy and creating NumPy arrays:
import numpy as np
arr = np.array([1, 2, 3, 4])
print(arr.shape)
Creating a pandas DataFrame:
import pandas as pd
data = {"Name": ["Ali", "Sara"], "Age": [25, 22]}
df = pd.DataFrame(data)
print(df)
Plotting a simple graph with Matplotlib:
import matplotlib.pyplot as plt
x = [1, 2, 3, 4]
y = [10, 20, 30, 40]
plt.plot(x, y)
plt.show()
This lab manual covers fundamental Python concepts for data science. As you progress, explore more advanced topics such as machine learning, deep learning, and big data processing using Python.