This mini-project demonstrates a complete binary classification pipeline using PyTorch on tabular data, covering data preprocessing, custom Dataset and DataLoader creation, and neural network modeling. It highlights hands-on use of training loops, loss optimization (BCELoss + Adam), validation/testing workflows, and performance visualization through accuracy and loss curves.
This mini-project demonstrates an end-to-end binary classification pipeline on tabular medical data using PyTorch. It covers dataset acquisition from Kaggle, exploratory preprocessing, label encoding (malignant vs benign), feature normalization, and train/validation/test splitting. The project implements a custom PyTorch Dataset and DataLoader, followed by a fully connected neural network with multiple hidden layers and sigmoid output for probabilistic prediction. It highlights hands-on training loops with BCELoss and Adam optimizer, device-aware (CPU/GPU) execution, and systematic evaluation using accuracy on validation and test sets. Model performance is analyzed through loss and accuracy curves, providing clear insights into convergence behavior and generalization on real-world clinical data.
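The core of the pipeline above can be sketched as follows. This is a minimal, hedged illustration: the random tensors stand in for the normalized Kaggle features (the real notebook loads and preprocesses the clinical CSV first), and layer widths are illustrative.

```python
import torch
from torch import nn
from torch.utils.data import Dataset, DataLoader

class TabularDataset(Dataset):
    """Wraps pre-scaled feature/label arrays as float tensors."""
    def __init__(self, X, y):
        self.X = torch.as_tensor(X, dtype=torch.float32)
        self.y = torch.as_tensor(y, dtype=torch.float32).reshape(-1, 1)

    def __len__(self):
        return len(self.X)

    def __getitem__(self, idx):
        return self.X[idx], self.y[idx]

# Device-aware execution, as in the project
device = "cuda" if torch.cuda.is_available() else "cpu"

# Stand-in for the normalized clinical feature matrix (30 columns here)
X_train = torch.randn(512, 30)
y_train = (X_train[:, 0] > 0).float()

train_loader = DataLoader(TabularDataset(X_train, y_train),
                          batch_size=64, shuffle=True)

# Fully connected network with sigmoid output for probabilistic prediction
model = nn.Sequential(
    nn.Linear(30, 64), nn.ReLU(),
    nn.Linear(64, 16), nn.ReLU(),
    nn.Linear(16, 1), nn.Sigmoid(),
).to(device)

criterion = nn.BCELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

for epoch in range(5):
    epoch_loss = 0.0
    for xb, yb in train_loader:
        xb, yb = xb.to(device), yb.to(device)
        optimizer.zero_grad()
        loss = criterion(model(xb), yb)
        loss.backward()
        optimizer.step()
        epoch_loss += loss.item() * len(xb)
    print(f"epoch {epoch}: train loss {epoch_loss / len(train_loader.dataset):.4f}")
```

Because the final layer is a sigmoid, outputs are probabilities in [0, 1] and pair directly with `BCELoss`; accuracy on validation/test sets is then computed by thresholding at 0.5.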
This mini-project demonstrates a clean binary classification experiment on a synthetic non-linearly separable dataset (two-moons) generated using sklearn.datasets.make_moons. It covers creating a toy dataset, packaging it into a Pandas DataFrame, performing simple feature scaling/normalization, and building a custom PyTorch Dataset + DataLoader for mini-batch training. The model is a small MLP (fully connected network) with ReLU activations and a sigmoid output layer, trained using a standard training loop with BCELoss and Adam optimizer on CPU/GPU. The experiment emphasizes how multi-layer perceptrons learn non-linear decision boundaries, and visualizes training behavior through a loss vs epochs curve for quick convergence inspection.
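A condensed sketch of this experiment, under the assumptions stated in the summary (make_moons data, simple standardization, small ReLU MLP with sigmoid output, BCELoss + Adam); hyperparameters here are illustrative:

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
from sklearn.datasets import make_moons

# Two-moons toy data: non-linearly separable, so a linear model fails
X, y = make_moons(n_samples=1000, noise=0.1, random_state=42)
X = (X - X.mean(axis=0)) / X.std(axis=0)  # simple feature standardization

ds = TensorDataset(torch.tensor(X, dtype=torch.float32),
                   torch.tensor(y, dtype=torch.float32).unsqueeze(1))
loader = DataLoader(ds, batch_size=32, shuffle=True)

# Small MLP: ReLU hidden layers let it bend the decision boundary
model = nn.Sequential(
    nn.Linear(2, 16), nn.ReLU(),
    nn.Linear(16, 16), nn.ReLU(),
    nn.Linear(16, 1), nn.Sigmoid(),
)
criterion = nn.BCELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

losses = []  # epoch-wise loss, for the loss-vs-epochs convergence plot
for epoch in range(30):
    for xb, yb in loader:
        optimizer.zero_grad()
        loss = criterion(model(xb), yb)
        loss.backward()
        optimizer.step()
    losses.append(loss.item())
```

Plotting `losses` against epoch number gives the convergence curve the summary refers to; swapping the hidden layers out for a single `nn.Linear(2, 1)` is a quick way to confirm that a linear model cannot separate the two moons.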
This mini-project builds a complete binary classification pipeline in PyTorch using the Kaggle Stroke Prediction dataset downloaded via opendatasets. It demonstrates practical tabular preprocessing steps such as imputing missing BMI values, dropping unused identifiers, converting categorical variables into numeric form using mapping and one-hot encoding, and casting the final feature matrix to float tensors for training. A lightweight MLP (19→64→32→1) with ReLU activations is trained on GPU/CPU using BCEWithLogitsLoss and Adam, while torchinfo.summary is used to verify tensor shapes and parameter counts. Training dynamics are tracked by logging epoch-wise loss and plotting a loss curve, making this a clean hands-on example of taking a real-world healthcare tabular dataset from raw CSV to a working neural baseline in PyTorch.
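The preprocessing-to-tensor path described above can be sketched on a tiny stand-in frame. The column names and values here are illustrative (not the full Kaggle schema), and the hidden widths follow the project's 64→32→1 head; the input width adapts to however many columns one-hot encoding produces:

```python
import pandas as pd
import torch
from torch import nn

# Small stand-in frame with the same kinds of columns as the stroke CSV
df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "gender": ["Male", "Female", "Female", "Male"],
    "work_type": ["Private", "Govt_job", "Private", "Self-employed"],
    "bmi": [28.1, None, 31.4, 24.0],
    "stroke": [0, 1, 0, 1],
})

df["bmi"] = df["bmi"].fillna(df["bmi"].median())            # impute missing BMI
df = df.drop(columns=["id"])                                # drop unused identifier
df["gender"] = df["gender"].map({"Male": 0, "Female": 1})   # binary mapping
df = pd.get_dummies(df, columns=["work_type"])              # one-hot encoding

# Cast the final feature matrix to float tensors for training
X = torch.tensor(df.drop(columns=["stroke"]).to_numpy(dtype="float32"))
y = torch.tensor(df["stroke"].to_numpy(dtype="float32")).unsqueeze(1)

n_features = X.shape[1]
model = nn.Sequential(
    nn.Linear(n_features, 64), nn.ReLU(),
    nn.Linear(64, 32), nn.ReLU(),
    nn.Linear(32, 1),                    # raw logits: no sigmoid needed here
)
criterion = nn.BCEWithLogitsLoss()       # applies sigmoid internally, stably
loss = criterion(model(X), y)
```

Note the final layer emits raw logits rather than probabilities: `BCEWithLogitsLoss` fuses the sigmoid into the loss for better numerical stability, which is why this model, unlike the BCELoss variants above, has no `nn.Sigmoid()` at the end.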
This mini-project builds a practical fraud-detection pipeline in PyTorch using the Kaggle Credit Card Fraud dataset, focusing on real-world challenges of highly imbalanced binary classification. It covers loading and preparing tabular features, standardizing inputs with StandardScaler, splitting into train/test sets, and training a regularized MLP with batch normalization and dropout for stable optimization on large batches. To address imbalance, it uses BCEWithLogitsLoss with a computed pos_weight, and trains with AdamW including weight decay for better generalization. The notebook uses torchinfo.summary to validate model shapes and parameter counts, and tracks both loss and recall over epochs (with thresholded sigmoid outputs) to emphasize minority-class detection performance, visualizing train vs test trends for both metrics to understand overfitting and detection quality.
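The imbalance-handling pieces above can be sketched as follows. The random ~1%-positive tensors are a stand-in for the scaled credit-card features, and the layer sizes and dropout rate are illustrative assumptions:

```python
import torch
from torch import nn

# Synthetic imbalanced stand-in (~1% positives, like fraud labels)
torch.manual_seed(0)
X = torch.randn(5000, 30)
y = (torch.rand(5000) < 0.01).float().unsqueeze(1)

# pos_weight = negatives / positives up-weights the minority class in the loss
n_pos = y.sum()
pos_weight = (len(y) - n_pos) / n_pos
criterion = nn.BCEWithLogitsLoss(pos_weight=pos_weight)

# Regularized MLP: batch norm stabilizes large-batch training, dropout regularizes
model = nn.Sequential(
    nn.Linear(30, 64), nn.BatchNorm1d(64), nn.ReLU(), nn.Dropout(0.3),
    nn.Linear(64, 32), nn.BatchNorm1d(32), nn.ReLU(), nn.Dropout(0.3),
    nn.Linear(32, 1),            # logits; sigmoid is applied only at eval time
)
# AdamW with weight decay for better generalization
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1e-2)

for epoch in range(3):
    optimizer.zero_grad()
    loss = criterion(model(X), y)
    loss.backward()
    optimizer.step()

# Recall from thresholded sigmoid outputs: the minority-class metric tracked per epoch
model.eval()
with torch.no_grad():
    preds = (torch.sigmoid(model(X)) >= 0.5).float()
recall = ((preds * y).sum() / y.sum()).item()
```

With roughly 1% positives, `pos_weight` lands near 99, so each missed fraud case costs about as much as 99 missed non-fraud cases; tracking recall rather than plain accuracy then avoids the trap where predicting "not fraud" everywhere scores 99% accurate.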