Python
https://study.com/academy/lesson/pearson-correlation-coefficient-formula-example-significance.html
https://gramener.com/retail/clothing-sales
https://docs.python.org/3/tutorial/index.html
http://interactivepython.org/runestone/static/pythonds/index.html
http://greenteapress.com/thinkpython2/html/index.html
http://worrydream.com/refs/Crowe-HistoryOfVectorAnalysis.pdf
https://en.wikipedia.org/wiki/Transformation_matrix
https://gist.github.com/anonymous/7d888663c6ec679ea65428715b99bfdd
https://www.mathbootcamps.com/proof-every-matrix-transformation-is-a-linear-transformation/
https://archive.ics.uci.edu/ml/datasets/SMS+Spam+Collection#
https://www.slideshare.net/21_venkat/decision-tree-53154033
https://www.thispersondoesnotexist.com/
http://www.whichfaceisreal.com/
Actual \ Predicted |  No  | Yes
-------------------+------+-----
No                 |  TN  | FP
Yes                |  FN  | TP
Precision = Positive Predictive Value = TP / (TP + FP)
Recall = Sensitivity = True Positive Rate (TPR) = TP / (TP + FN)
    (the fraction of actual 'Yes' cases that is predicted correctly)
Specificity = TN / (TN + FP)
Accuracy = (TN + TP) / (TN + TP + FN + FP)
False Positive Rate (FPR) = FP / (FP + TN)
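A minimal sketch computing these metrics in Python; TN, FP, FN, TP are hypothetical example counts:
# hypothetical confusion-matrix counts
TN, FP, FN, TP = 50, 10, 5, 35
precision = TP / (TP + FP)                  # positive predictive value
recall = TP / (TP + FN)                     # = sensitivity = TPR
specificity = TN / (TN + FP)
accuracy = (TN + TP) / (TN + TP + FN + FP)
fpr = FP / (FP + TN)                        # = 1 - specificity
print(precision, recall, specificity, accuracy, fpr)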
Weight of Evidence: WOE = ln(Percentage of Good / Percentage of Bad)
Information Value: IV = sum over buckets of (Good in bucket / Total Good - Bad in bucket / Total Bad) * WOE
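A sketch of computing WOE and IV per bucket with pandas, on a hypothetical DataFrame with a 'bucket' column and a binary 'bad' flag (the data and column names are made up for illustration):
import numpy as np
import pandas as pd
# hypothetical data: each row is an observation; bad=1 is a bad outcome, bad=0 a good one
data = pd.DataFrame({'bucket': ['A', 'A', 'A', 'B', 'B', 'B'],
                     'bad':    [0,   0,   1,   0,   1,   1]})
grp = data.groupby('bucket')['bad']
pct_bad = grp.sum() / data['bad'].sum()                          # bad in bucket / total bad
pct_good = (grp.count() - grp.sum()) / (data['bad'] == 0).sum()  # good in bucket / total good
woe = np.log(pct_good / pct_bad)
iv = ((pct_good - pct_bad) * woe).sum()
print(woe)
print(iv)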
# multiples of 6 for the even numbers in 1..10 -> [12, 24, 36, 48, 60]
list1 = [n*6 for n in range(1, 11) if n % 2 == 0]
# Python 3: types.IntType no longer exists; use isinstance, and wrap map in list() to materialise the result
squared_ints = list(map(lambda e: e**2, filter(lambda e: isinstance(e, int), a_list)))
# equivalent list comprehension
squared_ints = [e**2 for e in a_list if isinstance(e, int)]
# total sales per region
master_df.groupby('Region').Sales.sum()
# round Profit to one decimal place
master_df['Profit'] = master_df['Profit'].apply(lambda x: round(x, 1))
# number of profitable orders per Region x Customer_Segment
master_df.pivot_table(values='is_profitable', index='Region', columns='Customer_Segment', aggfunc='sum')
# inner join of df_2 with shipping_df on Ship_id
df_3 = pd.merge(df_2, shipping_df, how='inner', on='Ship_id')
# stack df1 and df2 row-wise
pd.concat([df1, df2], axis=0)
# read a tab-separated file with a non-UTF-8 encoding
companies = pd.read_csv("companies.txt", sep="\t", encoding="ISO-8859-1")
import pymysql
# create a connection object 'conn'
conn = pymysql.connect(host="localhost", # your host, localhost for your local machine
user="root", # your username, usually "root" for localhost
passwd="yourpassword", # your password
db="world") # name of the data base; world comes inbuilt with mysql
# create a cursor object c
c = conn.cursor()
# execute a query using c.execute
c.execute("select * from city;")
# fetch all rows as a tuple of tuples
all_rows = c.fetchall()
# to get only the first row, use c.fetchone() instead
import requests, bs4
# getting HTML from the Google Play web page
url = "https://play.google.com/store/apps/details?id=com.facebook.orca&hl=en"
req = requests.get(url)
# create a bs4 object
# To avoid warnings, provide "html5lib" explicitly
soup = bs4.BeautifulSoup(req.text, "html5lib")
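A quick sanity check that the parse worked; the <title> tag is standard HTML, though its exact text depends on the page:
# print the page title from the parsed soup
print(soup.title.text)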
import PyPDF2
# reading the pdf file (PyPDF2 1.x/2.x API; version 3.x renames these to
# PdfReader, len(reader.pages) and page.extract_text())
pdf_object = open('animal_farm.pdf', 'rb')
pdf_reader = PyPDF2.PdfFileReader(pdf_object)
# number of pages in the PDF file
print(pdf_reader.numPages)
# get a certain page (0-indexed, so this is the sixth page)
page_object = pdf_reader.getPage(5)
# extract text from the page_object
print(page_object.extractText())
# count of missing values in each column
df.isnull().sum()
# does any column contain a missing value?
df.isnull().any()
# same check, explicitly column-wise
df.isnull().any(axis=0)
# rows containing at least one missing value
df.isnull().any(axis=1)
# rows where every value is missing
df.isnull().all(axis=1)
# number of completely empty rows
df.isnull().all(axis=1).sum()
# count of missing values in each row
df.isnull().sum(axis=1)
# percentage of missing values per column, rounded to 2 decimals
round(100*(df.isnull().sum()/len(df.index)), 2)
# rows with more than 5 missing values
df[df.isnull().sum(axis=1) > 5]
# how many such rows there are
len(df[df.isnull().sum(axis=1) > 5].index)
# keep only rows with at most 5 missing values
df = df[df.isnull().sum(axis=1) <= 5]
# drop rows where Price is missing
df = df[~np.isnan(df['Price'])]
# impute missing latitude values with the column mean ('Lattitude' is the column's spelling in the dataset)
df.loc[np.isnan(df['Lattitude']), ['Lattitude']] = df['Lattitude'].mean()
# treat Car as a categorical variable
df['Car'] = df['Car'].astype('category')
# frequency of each category
df['Car'].value_counts()
# mean Sales by year (rows) and month (columns)
year_month = pd.pivot_table(df, values='Sales', index='year', columns='month', aggfunc='mean')
# plot the pivot table as a heatmap
plt.figure(figsize=(12, 8))
sns.heatmap(year_month, cmap="YlGnBu")
plt.show()
For a normally distributed variable (the 68-95-99.7 empirical rule):
~68% probability of lying within 1 standard deviation of the mean
~95% probability of lying within 2 standard deviations of the mean
~99.7% probability of lying within 3 standard deviations of the mean
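A quick numeric check of these figures with scipy's standard-normal CDF (assumes scipy is installed):
from scipy.stats import norm
# P(-k < Z < k) for k = 1, 2, 3 standard deviations
for k in (1, 2, 3):
    print(k, round(norm.cdf(k) - norm.cdf(-k), 4))  # 0.6827, 0.9545, 0.9973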
P(-0.5 < Z < 0.5) = P(Z < 0.5) - P(Z < -0.5)
P(Z<=2.33) + P(Z>2.33) = 1,
P(Z>2.33) = 1 - P(Z<2.33)
P(Z>-3) = 1 - P(Z<-3)
P(-2.2 < Z < 1.8) = P(Z < 1.8) - P(Z < -2.2)
Deep Learning
https://learning.oreilly.com/library/view/deep-learning-illustrated/9780135116821/
https://resources.oreilly.com/live-training/deep-learning
https://github.com/the-deep-learners/TensorFlow-LiveLessons
https://ai.googleblog.com/2016/06/wide-deep-learning-better-together-with.html