Artificial Intelligence (AI) is a branch of computer science that aims to create intelligent machines. The basic idea is to build algorithms that take input data and use mathematical analysis to predict an output within an acceptable range. AI has become an essential part of the technology industry and is one of the most powerful technologies for reshaping engineering outcomes. It can optimize processes across domains and already powers some of the world's most valuable engineering applications. Further, AI is expected to become a permanent feature of the business landscape, so AI capabilities must be sustained over time to develop and support new engineering models and capabilities. Across these domains, AI is typically used to mine large amounts of data for hidden correlations, patterns, and other useful information.
Specifically, it is believed that companies need dedicated organizational units to entrench AI: a business tool this important cannot be left to bottom-up initiative alone. Leading organizations are already devoting considerable financial resources to AI, and the necessary skills and experience are too rare to be left scattered around the organization with little coordination or collaboration. Just as e-commerce gave rise to Digital Officers and teams supporting online commerce, AI is expected to engender new Centres of Excellence (CoE) and new roles within them.
To support academic work in AI, the Centre of Excellence (CoE) in Artificial Intelligence at NSUT shall provide a high-performance computing (HPC) facility for degree- and diploma-level institutions under the Govt. of NCT of Delhi. The Centre of Excellence has been developed with financial support from the DTTE/DKDF.
High-performance computing applications in Artificial Intelligence make heavy use of deep neural network models, whose workloads consist largely of very large matrix multiply-accumulate operations. These operations can be spread over many cores and are embarrassingly parallel on Graphics Processing Units (GPUs). Further, Tensor Cores execute them in mixed precision with little or no loss of accuracy (a minimal sketch of such a mixed-precision operation follows the equipment list below). To support such workloads, the following hardware and software components are installed in the centre:
1. Supercomputing system: DGX-A100 by M/s NVIDIA
(The DGX A100 has 8 GPUs, 40 GB memory per GPU, 320 GB total)
2. 100 TB Storage: Model ES7990x by M/s DDN
3. Login & Management Server: Model R740xd by M/s Dell EMC
4. AI Inference Server: Model R740xd by M/s Dell EMC
5. AI Development Workstation: Model 7920 Tower by M/s Dell EMC
6. 100G Open Ethernet Switch: Model SN2700 by M/s Mellanox
7. 1G Copper Switch-24 Port: Model N3224T by M/s Dell EMC
8. 10G Copper Switch-48 Port: Model S-4148T by M/s Dell EMC
9. KVM Switch: Model AV116 by M/s Vertiv
10. Smart Rack & UPS: Smart Rack - Row by M/s Vertiv
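As an illustration of the mixed-precision matrix operations described above, the following is a minimal sketch, assuming PyTorch with CUDA is available on the system; the matrix sizes are illustrative only. It multiplies two large matrices once in FP32 and once inside an autocast region, where the multiply-accumulate is performed in FP16 and is eligible to run on the A100's Tensor Cores, and then reports the difference between the two results.

    # Minimal mixed-precision matrix-multiply sketch (assumes PyTorch + CUDA;
    # matrix sizes are illustrative).
    import torch

    device = torch.device("cuda")              # any one of the eight A100 GPUs
    a = torch.randn(8192, 8192, device=device)
    b = torch.randn(8192, 8192, device=device)

    # Full-precision (FP32) reference product.
    ref = a @ b

    # Mixed precision: inside the autocast region the matmul is performed in
    # FP16, which the A100's Tensor Cores accelerate.
    with torch.autocast(device_type="cuda", dtype=torch.float16):
        out = a @ b

    print(out.dtype)                           # torch.float16
    rel_err = (out.float() - ref).abs().max() / ref.abs().max()
    print(f"max relative difference vs FP32: {rel_err.item():.3e}")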
The NVIDIA A100 Tensor Core GPU delivers unprecedented acceleration for AI, data analytics, and high-performance computing (HPC) to tackle the world's toughest computing challenges. With third-generation NVIDIA Tensor Cores providing a huge performance boost, the A100 GPU can scale efficiently to thousands of GPUs or, with Multi-Instance GPU (MIG), be partitioned into as many as seven smaller, dedicated instances to accelerate workloads of all sizes.
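Within a single DGX A100, the most direct form of this scaling is spreading one large batch across the eight GPUs. The following is a minimal sketch, assuming PyTorch with CUDA is available; the model and tensor sizes are illustrative only. DataParallel replicates the model on every visible GPU and splits the batch among them (multi-node scaling would typically use DistributedDataParallel instead).

    # Minimal data-parallel sketch (assumes PyTorch + CUDA; the model and
    # sizes are illustrative). DataParallel replicates the model on every
    # visible GPU and scatters the input batch across them.
    import torch
    import torch.nn as nn

    model = nn.Sequential(nn.Linear(4096, 4096), nn.ReLU(), nn.Linear(4096, 10))
    model = nn.DataParallel(model.cuda())        # uses all GPUs visible to the process

    x = torch.randn(1024, 4096, device="cuda")   # one large batch
    y = model(x)                                 # split across GPUs, gathered on GPU 0
    print(y.shape, "computed on", torch.cuda.device_count(), "GPU(s)")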
With Multi-Instance GPU, the eight A100 GPUs in the DGX A100 can be configured into as many as 56 GPU instances, each fully isolated with its own high-bandwidth memory, cache, and compute cores. This allows administrators to right-size GPU resources with guaranteed quality of service (QoS) for multiple workloads. The DGX A100 also integrates a tested and optimized DGX software stack, including an AI-tuned base operating system, all necessary system software, GPU-accelerated applications, and pre-trained models.
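Once an administrator has partitioned a GPU into MIG instances (for example, with the nvidia-smi MIG commands), each instance appears to an application as an ordinary, isolated CUDA device. The following is a minimal sketch, assuming PyTorch, of pinning a workload to one such instance by its MIG UUID; the UUID shown is a hypothetical placeholder, and the real identifiers on the DGX are listed by nvidia-smi -L.

    # Minimal sketch: pin a workload to a single MIG instance (assumes
    # PyTorch + CUDA). The UUID below is a hypothetical placeholder; the real
    # MIG device identifiers are listed by `nvidia-smi -L`.
    import os

    # Must be set before the process touches CUDA for the first time.
    os.environ["CUDA_VISIBLE_DEVICES"] = "MIG-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"

    import torch

    print(torch.cuda.device_count())    # 1: only the selected MIG slice is visible
    x = torch.randn(2048, 2048, device="cuda")
    print((x @ x).shape)                # runs inside the instance's dedicated
                                        # memory and compute resources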