Hi there. Thank you for visiting this site. I am Sayak (সায়ক). In my current role at DataCamp, I develop projects for DataCamp Project; my first project, Predicting Credit Card Approvals, launched recently. I also create exercises for DataCamp Practice and write technical tutorials for DataCamp Community on a daily basis. Prior to DataCamp, I worked as a developer at TCS Research and Innovation (TRDDC) in the Cyber Security domain (specifically Data Privacy). There, I was part of TCS's critically acclaimed GDPR solution, Crystal Ball. Before that, I worked as a Web Services Developer at TCS (Kolkata area). I am also working with Dr. Anupam Ghosh and my beloved college juniors on Applied Machine Learning research and tinkering. Currently, we are working on applying machine learning to Phonocardiogram classification.
My interests lie broadly in areas like Machine Learning Interpretability and Full-Stack Data Science. I aspire to a career in Data Science in which I can interpret models and communicate the results effectively.
Links to my DataCamp blogs:
- KMeans clustering with scikit-learn - https://goo.gl/dT7kYq
- DBSCAN: A macroscopic investigation in Python - https://goo.gl/fDGYUn
- Hyperparameter Optimization in Machine Learning Models - https://goo.gl/5C6ouV
- Towards Preventing Overfitting: Regularization - https://goo.gl/B9vxia
- Ensemble Learning in Python - https://goo.gl/dmH9db
- Investigating Tensors with PyTorch - https://goo.gl/yoYsVL
- Introduction to Feature Selection - https://goo.gl/gY8rwy
- Demystifying crucial Statistics in Python* - https://goo.gl/i2Wm5v
- Diving Deep with Imbalanced Data - https://goo.gl/fZnYmV
- Introduction to Cyclical Learning Rates - https://goo.gl/2fpkQQ
- Turning Machine Learning Models into APIs in Python** - https://goo.gl/vwzqtA
- Essentials of Linear Regression in Python*** - https://goo.gl/5nuVmt
- Simplifying Sentiment Analysis in Python - https://goo.gl/62mEJo
- Automated Machine Learning with Auto-Keras - https://goo.gl/XEjea4
- Introduction to Indexing in SQL - https://goo.gl/7dcnE7
- Understanding Recursive Functions in Python - https://goo.gl/u1U2eH
- Beginner's Guide to Google's Vision API in Python - https://goo.gl/VCwZa8
- Beginner's Guide to PostgreSQL - https://goo.gl/DV1rhY
- Managing Databases in PostgreSQL - https://goo.gl/YA9fAy
- Working with Spreadsheets in SQL - https://goo.gl/PYUb2v
- Installing PostgreSQL on Windows and Mac OS X - https://goo.gl/CyF8T4
- Using Order By Keyword in SQL - https://goo.gl/i7mD8f
- Introduction to Alter Table Statement in SQL - https://goo.gl/qWi3km
- SQLite in Python - https://goo.gl/wYCr4e
- Introduction to Where Clause in SQL - https://goo.gl/VB3CdX
- Introduction to SQL Joins - https://goo.gl/2w342W
- 10 command-line utilities in PostgreSQL - https://goo.gl/xFWbRS
- CASE Statements in PostgreSQL - https://bit.ly/2HWBSwu
- Aggregate Functions in SQL - https://bit.ly/2GnDqg9
- Cleaning Data in SQL - http://bit.ly/2GyPdrL
- Materialized Views in PostgreSQL - http://bit.ly/2VFz11x
- Argument Parsing in Python - http://bit.ly/2LOWGsJ
*This article was featured in "Python Top 10 Articles for the Past Month (v.Oct 2018)" and secured a rank of 4.
**This article was featured in "Machine Learning Top 10 Articles for the Past Month (v.Nov 2018)" and secured a rank of 9.
***This article was featured in "Python Top 10 Articles for the Past Month (v.Dec 2018)" and secured a rank of 10.
- Introduction to Anomaly Detection in Python (FloydHub) - https://bit.ly/2TZLg4d
- Introduction to K-Means Clustering in Python with scikit-learn (FloydHub) - https://bit.ly/2IZev5a
- An introduction to Q-Learning: Reinforcement Learning (FloydHub) - http://bit.ly/2HxuVzo
- A comprehensive list of data science resources for developers (for Intel DevMesh) - https://intel.ly/2J2UYSs
- Detecting phishing websites using machine learning (Intel Software Innovators' Medium Channel) - http://bit.ly/2YBvaAs
- Introduction to procedures and cursors in SQL (Towards Data Science) - https://bit.ly/2OTd8WF
Some research problems I wish to solve:
- Surveillance for water wastage: Water wastage is a vicious problem. In spite of several campaigns and countless awareness activities, it remains a persistent problem. In countries like India, especially in rural areas, it poses a great threat. The aim of this work is to use modern image processing and information retrieval techniques to extract the relevant images from satellite image data and to build an effective surveillance system that reduces water wastage.
- Information extraction from Annual Report: Most companies formally publish their annual financial statements on their websites. These are typically published in PDF format, with the financial data usually presented in tables. Banks and other financial institutions use these reports to evaluate company performance when approving loans or managing other transactions. Today, financial institutions spend a huge amount of manual effort fetching these reports and extracting the financial data from them. The objective is to automate this extraction process to minimize the manual effort. This will enable companies to increase their productivity and save considerable effort.
- Generate Corporate profiles from the Web: When a company engages with its clients and establishes a relationship, it performs an initial KYC (Know Your Customer) check to get background information about the client company and its key stakeholders and employees, such as the list of the client's C-level executives and their designations, HQ address, phone numbers, etc. The KYC is done manually for every client, and some large companies have hundreds of thousands of clients. Fetching profile information from company websites or public search engines is tedious and takes considerable time. The objective of this use case is to automate the information extraction process to save effort and increase productivity.
- Towards intelligent food safety and food distribution: Food wastage and poor food quality are genuine problems in many countries like India. How can we use AI techniques to maintain a good trade-off between safety and distribution in food care?
*(I am open to discussing or collaborating on these problems.)
- Paul S., Banerjee C., Ghoshal M. (2018) A CFS–DNN-Based Intrusion Detection System. In: Bera R., Sarkar S., Chakraborty S. (eds) Advances in Communication, Devices and Networking. Lecture Notes in Electrical Engineering, vol 462. Springer, Singapore.
- Gupta J., Paul S., Ghosh A. (2019) A Novel Transfer Learning-Based Missing Value Imputation on Discipline Diverse Real Test Datasets—A Comparative Study with Different Machine Learning Algorithms. In: Abraham A., Dutta P., Mandal J., Bhattacharya A., Dutta S. (eds) Emerging Technologies in Data Mining and Information Security. Advances in Intelligent Systems and Computing, vol 814. Springer, Singapore.
- C. Banerjee, S. Paul and M. Ghoshal, "A Comparative Study of Different Ensemble Learning Techniques Using Wisconsin Breast Cancer Dataset," 2017 International Conference on Computer, Electrical & Communication Engineering (ICCECE), Kolkata, India, 2017, pp. 1-6.
- Sengupta, S.; Basak, S.; Saikia, P.; Paul, S.; Tsalavoutis, V.; Atiah, F.D.; Ravi, V.; Peters II, R.A. A Review of Deep Learning with Special Emphasis on Architectures, Applications and Recent Trends. Preprints 2019, 2019020233 (doi: 10.20944/preprints201902.0233.v1).
Seminars/ Talks/ Workshops:
- 2-day workshop on Ethical Hacking, organized by Netaji Subhash Engineering College, 15 - 16th April, 2014.
- 2-day workshop on Android, organized by Netaji Subhash Engineering College, 23 - 24th May, 2014.
- Technical Lecture meeting on Emerging Trends in Next-Gen Computing Systems: Multicore, IoT, Big Data and Cloud perspective, organized by IIEST, Shibpur, 8th January, 2016.
- 1-day awareness workshop on Smart India Hackathon, organized by Heritage Institute of Technology, 10th October, 2016.
- 1-day workshop on DevOps, organized by IIT-KGP, 17th January, 2017.
- ACM India Annual Event, hosted by Amity University (met Turing Award winner Leslie Valiant), 21st January, 2017.
- 5-day workshop on Big Data, organized by Netaji Subhash Engineering College, 16 - 21st February, 2017.
- 1-day workshop on Distributed Computing, organized by Netaji Subhash Engineering College, 26th April, 2017.
- 1-day ACM virtual workshop on Green Computing, 3rd May, 2017.
- 1-day workshop on Virtual Labs, organized by Netaji Subhash Engineering College, 20th May, 2017.
- International Conference on Communication Devices and Networking, SMIT, Sikkim, 3rd June, 2017.
- International Conference on Computer, Electrical & Communication Engineering, TIU, Kolkata, 23rd December, 2017.
- International Conference on Emerging Technologies in Data Mining and Information Security, UEM, 23rd February, 2018.
- "Multi-Agent Imitation Learning for Driving Simulation" by Raunak P. Bhattacharyya, TRDDC, Pune, 15th July, 2018.
- Delivered a speech on "Industrial exposure of Applied Machine Learning" at MLCC Study Jam organized at Netaji Subhash Engineering College (Kolkata) by the dept. of Information Technology, 6th August, 2018.
- Conducted a short session on "Hyperparameter optimization in Machine Learning models" at the final MLCC Study Jam organized at Netaji Subhash Engineering College (Kolkata) by the dept. of Information Technology, 23rd August, 2018.
- Google Cloud Meetup by Abhishek Nandy (Google Cloud Community Lead, Kolkata), organized at Netaji Subhash Engineering College (Kolkata).
- Delivered a talk on "Cyclical Learning Rates for training Neural Nets" at Google DevFest 2018, Kolkata, India, 3rd November, 2018.
- Conducted a hack-session on "Cyclical Learning Rates" at DataHack Summit (organized by Analytics Vidhya), Bangalore, 23rd November, 2018.
- Organized a Full-Stack Data Science Workshop along with The Code Foundation and GDG Kolkata, Kolkata, 19th January, 2019.
- Delivered talks on "Introduction to BigQuery" at GDG Kolkata Cloud Study Jam (Academy of Technology) and Google Cloud Next '19 Extended - Kolkata, on 12th April and 20th April, 2019, respectively.
- Conducted a session on "Ten Updates Introduced in TensorFlow 2.0", along with quizzes, at Google I/O Extended 2019, 11th May, 2019.
- Data Science Intern at CareerIn (Dec, 2016 - Feb, 2017).
- Intern at Notice App (Nov, 2016 - Jan, 2017).
- ISP (Internshala Student Partner) at Internshala (March, 2017 - August, 2017).
- Contributed to the development of two major college websites (nsecit.in, placement.nsec.ac.in)
- Stanford Research Slack collaborator
- Student Course Coordinator of the department
- Placement Representative of the department
- Beta tester of DataCamp's course "Deep Learning in Python"
- Taught underprivileged children and managed operations for a TCS-CSR initiative called H2O (Helping Hand Organization)
- Moderator of the Artificial Intelligence channel of Campus Commune (Youngest)
- Invited mentor for Smart India Hackathon 2018 (Youngest)
- Advisor to Overleaf, an online LaTeX platform
- Contributing author at Towards Data Science
- Invited mentor for Smart India Hackathon 2019
- Mentor for GirlScript Summer of Code 2019
- Book reviewer at Manning Publications Co.
- I love listening to all genres of music and play the guitar myself. I played in a band, "Behest", from 2013 to 2017, and performed with it at several college festivals, including IIT-Delhi and JU. I also love watching TV series (Narcos, Game of Thrones, and Fringe being all-time favorites).
- I am open to discussing new ideas and collaborating on projects. If you are interested, just contact me using the details below.
- Tapas Kumar Paul
- Baby Paul