Architecturally, a CPU is composed of just a few cores with large amounts of cache memory, and can handle only a few software threads at a time. In contrast, a GPU is composed of hundreds of cores that can handle thousands of threads simultaneously.
The NVIDIA RAPIDS™ suite of open-source software libraries, built on CUDA-X AI, provides the ability to execute end-to-end data science and analytics pipelines entirely on GPUs. It relies on NVIDIA CUDA primitives for low-level compute optimization, but exposes that GPU parallelism and high-bandwidth memory speed through user-friendly Python interfaces.
With the RAPIDS GPU DataFrame, data can be loaded onto GPUs using a Pandas-like interface, and then used for various connected machine learning and graph analytics algorithms without ever leaving the GPU. This level of interoperability is made possible through libraries like Apache Arrow and allows acceleration for end-to-end pipelines—from data prep to machine learning to deep learning.
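As a concrete illustration, the sketch below shows what such a pipeline can look like, assuming cuDF and cuML are installed (for example from the rapidsai conda channel); the file name, column names, and the choice of KMeans are illustrative placeholders rather than anything prescribed above.

```python
# A minimal sketch (not a prescribed RAPIDS recipe): the CSV file, column
# names and the choice of KMeans are illustrative assumptions.
import cudf
from cuml.cluster import KMeans

# Load the data straight into GPU memory with a pandas-like API
gdf = cudf.read_csv("transactions.csv")

# Familiar pandas-style data prep, executed on the GPU
gdf = gdf.dropna()
features = gdf[["amount", "frequency"]]

# Pass the same GPU-resident columns to a cuML estimator -- the data
# never leaves the GPU between the prep step and the model fit
kmeans = KMeans(n_clusters=4, random_state=0)
kmeans.fit(features)
gdf["cluster"] = kmeans.labels_

print(gdf.head())
```

The key point is that the DataFrame returned by cudf.read_csv already lives in GPU memory, so the cuML estimator consumes it without a round trip through host RAM.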
In 2023, GenAI established itself as a transformative force, becoming a strategic priority for IT/ITeS companies globally. Its rapid adoption reflects the urgency felt by enterprises to embrace this shift in the enterprise technology landscape. Companies worldwide rushed to understand and capitalize on this transformative wave, leveraging technology service providers to conduct proofs of concept (POCs) focused on cost optimization, productivity enhancement and efficiency gains.

The year 2024 saw enterprises transitioning from exploration to scaling successful pilot programs. GenAI became a key agenda item in boardrooms and strategy discussions, with organizations investing heavily to capture the opportunities it presents. Service providers, in turn, are poised to redefine their portfolios, unlock profitability and create value across the next five years. Indian technology service companies have taken center stage in this movement, not only executing GenAI pilots for their clients but also incorporating GenAI into their internal operations.
The model in supervised learning usually refers to the mathematical structure by which the prediction $\hat{y}_i$ is made from the input $x_i$. A common example is a linear model, where the prediction is given as $\hat{y}_i = \sum_j \theta_j x_{ij}$, a linear combination of weighted input features. The prediction value can have different interpretations, depending on the task, i.e., regression or classification. For example, it can be logistic transformed to get the probability of the positive class in logistic regression, and it can also be used as a ranking score when we want to rank the outputs.

Scikit-learn 0.21 introduced two new implementations of gradient boosted trees, namely HistGradientBoostingClassifier and HistGradientBoostingRegressor, inspired by LightGBM (see [LightGBM]).
These histogram-based estimators can be orders of magnitude faster than GradientBoostingClassifier and GradientBoostingRegressor when the number of samples is larger than tens of thousands.
They also have built-in support for missing values, which avoids the need for an imputer.
These fast estimators first bin the input samples X into integer-valued bins (typically 256 bins), which tremendously reduces the number of splitting points to consider and allows the algorithm to leverage integer-based data structures (histograms) instead of relying on sorted continuous values when building the trees. The API of these estimators is slightly different, and some of the features of GradientBoostingClassifier and GradientBoostingRegressor are not yet supported, for instance some loss functions.
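As a hedged sketch of how these estimators are typically used, the example below trains a HistGradientBoostingClassifier on synthetic data with injected NaNs; the data, the 10% missing rate, and the parameter values are illustrative choices, not taken from the text above.

```python
# Illustrative sketch only: the synthetic data, missing-value rate and the
# parameter values below are assumptions, not taken from the text above.
import numpy as np
from sklearn.ensemble import HistGradientBoostingClassifier
# On scikit-learn versions where these estimators are still experimental,
# `from sklearn.experimental import enable_hist_gradient_boosting` is
# required before the import above.
from sklearn.model_selection import train_test_split

rng = np.random.RandomState(0)
X = rng.normal(size=(50_000, 5))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

# Inject missing values: no imputer is needed, NaNs are handled natively
X[rng.uniform(size=X.shape) < 0.1] = np.nan

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# max_bins controls the number of integer-valued bins used when binning the
# input samples (with one extra bin reserved for missing values)
clf = HistGradientBoostingClassifier(max_bins=255, max_iter=100)
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```

The 50,000-sample size is chosen to match the regime where the histogram-based estimators tend to show their speed advantage over the non-binned implementations.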