Visit the official SkillCertPro website:

NVIDIA Professional: AI Infrastructure (NCP‑AII) Practice Tests 2026

NVIDIA Professional: AI Infrastructure (NCP‑AII) Exam Questions 2026

The NVIDIA Professional: AI Infrastructure (NCP‑AII) Questions 2026 set contains 350+ exam questions to help you pass the exam on the first attempt.

SkillCertPro offers real exam questions for practice for all major IT certifications.

For the full set of 360 questions, go to:

https://skillcertpro.com/product/nvidia-ai-infrastructure-ncp-aii-exam-questions/

SkillCertPro offers detailed explanations for each question, which help you understand the concepts better.

It is recommended to score above 85% in SkillCertPro exams before attempting the real exam.
SkillCertPro updates exam questions every 2 weeks.
You will get lifetime access and lifetime free updates.
SkillCertPro offers a 100% pass guarantee on the first attempt.

Below are 10 free sample questions.

Question 1:

A system administrator is configuring a cluster where specific nodes require high-throughput storage access for large datasets. They decide to use BlueField DPUs to implement NVMe-over-Fabrics (NVMe-oF) storage acceleration. Which step is essential for configuring the BlueField network platform to support this specific offload capability?

A. Disable the SNAP (Software-defined Network Accelerated Processing) service on the DPU to allow the host to handle storage interrupts.

B. Configure the DPU in Embedded Function mode and use DOCA drivers to expose virtual NVMe controllers to the host OS.

C. Connect the DPU to the BMC via a serial cable to allow the BMC to manage the NVMe flash translation layer.

D. Install the CUDA Toolkit directly on the BlueField ARM cores to process storage encryption using the integrated Tensor Cores.

Answer: B

Explanation:

B. Configure the DPU in Embedded Function mode and use DOCA drivers to expose virtual NVMe controllers to the host OS.

For a BlueField DPU to perform storage offloading (specifically using NVIDIA DOCA SNAP), it must be in Embedded Function mode (also known as DPU mode). In this state, the DPU's ARM cores run their own OS and management stack. Using the DOCA framework, the DPU emulates a local physical NVMe drive on the host's PCIe bus. The host OS "sees" a standard NVMe controller and interacts with it using native drivers, while the DPU transparently handles the NVMe-over-Fabrics translation and network data movement in the background.

Incorrect:

A. Disable the SNAP (Software-defined Network Accelerated Processing) service on the DPU to allow the host to handle storage interrupts.

NVIDIA SNAP is the very technology required to enable storage acceleration. Disabling it would prevent the DPU from intercepting and accelerating storage traffic. The goal of the DPU is to offload these interrupts from the host CPU, not to pass them back.

C. Connect the DPU to the BMC via a serial cable to allow the BMC to manage the NVMe flash translation layer.

The Baseboard Management Controller (BMC) is for out-of-band server management (power, thermals, firmware updates) and has neither the computational power nor the architectural path to manage high-speed NVMe flash translation or storage protocols. These tasks are handled by the DPU's ARM cores and specialized hardware accelerators.

D. Install the CUDA Toolkit directly on the BlueField ARM cores to process storage encryption using the integrated Tensor Cores.

While BlueField DPUs have ARM cores and hardware acceleration engines, they do not contain Tensor Cores (which are specific to NVIDIA GPUs). Storage encryption on a BlueField DPU is typically handled by dedicated hardware crypto engines via DOCA libraries, not by running CUDA kernels on the ARM cores.

Question 2:

During the physical installation of GPU-based servers, a technician must validate that the cooling parameters meet the requirements for NVIDIA H100 GPUs. If the BMC reports that the GPU inlet temperature is nearing the thermal throttle limit despite low ambient room temperatures, what is the most likely physical configuration error within the server rack?

A. The server is missing blanking panels in the rack, causing hot air recirculation from the hot aisle back into the cold aisle.

B. The TPM is not initialized correctly, which prevents the motherboard fans from reaching their maximum RPM setpoints during high workloads.

C. The storage array is connected via SAS instead of NVMe, significantly increasing the heat density of the server chassis and blocking airflow.

D. The GPU-based servers are configured with the wrong IP addresses in the OOB management network preventing proper fan speed control.

Answer: A

Explanation:

A. The server is missing blanking panels in the rack, causing hot air recirculation from the hot aisle back into the cold aisle.

The NCP-AII curriculum emphasizes "Data Center Hygiene" as a prerequisite for stable AI infrastructure. Blanking panels are essential for maintaining the Hot-Aisle/Cold-Aisle containment model. Without them, the high-pressure hot air exhausted from the rear of the servers can leak back through empty rack spaces into the front (cold) aisle. This "recirculation" increases the temperature of the air entering the GPU inlets. Even if the room's ambient temperature is low, the localized air at the server face becomes hot enough to trigger the H100's thermal protection mechanisms, leading to reduced clock speeds (throttling).

Incorrect:

B. The TPM is not initialized correctly, which prevents the motherboard fans from reaching their maximum RPM setpoints during high workloads.

The Trusted Platform Module (TPM) is a security chip used for hardware-based root of trust and encryption. It has no functional link to the server's thermal management system or the Pulse Width Modulation (PWM) signals that control fan speed. Fan profiles are typically managed by the BMC (Baseboard Management Controller) based on temperature sensors, not security module states.

C. The storage array is connected via SAS instead of NVMe, significantly increasing the heat density of the server chassis and blocking airflow.

While the type of storage affects data throughput, it is not a primary driver of the “thermal throttle limit“ for GPUs in an NVIDIA-Certified System. Furthermore, modern NVMe drives often generate more heat than traditional SAS drives due to their higher performance. Regardless, the choice of storage interface does not fundamentally block the massive airflow required by H100 GPUs unless the physical cabling is so poorly managed that it violates the server‘s internal airflow design.

D. The GPU-based servers are configured with the wrong IP addresses in the OOB management network preventing proper fan speed control.

The BMC, which is reached over the Out-of-Band (OOB) management network, handles fan speed control internally via its own firmware and onboard sensors. While a wrong IP address would prevent the administrator from remotely logging into the BMC to view logs or change settings, it does not stop the BMC from automatically increasing fan speeds when it detects rising GPU temperatures. The fan control logic is autonomous to the server and does not rely on network connectivity.

Question 3:

A Linux administrator is installing the NVIDIA Container Toolkit on a fresh Ubuntu installation to support Docker-based AI workloads. After installing the package, what is the mandatory next step to ensure the Docker daemon can utilize the NVIDIA GPU runtime?

A. Recompile the Linux kernel with the CUDA_SUPPORT=yes flag and reboot the machine.

B. Run the ‘nvidia-smi --factory-reset‘ command to clear the GPU state for container consumption.

C. Edit the /etc/docker/daemon.json file to set the ‘default-runtime‘ to ‘nvidia‘ and restart the Docker service.

D. Install the DOCA SDK on the host and map the GPU via a virtual PCIe switch to the Docker container.

Answer: C

Explanation:

C. Edit the /etc/docker/daemon.json file to set the ‘default-runtime‘ to ‘nvidia‘ and restart the Docker service.

The NCP-AII curriculum specifies that simply installing the nvidia-container-toolkit package is insufficient. The Docker engine must be explicitly instructed to use the NVIDIA Container Runtime (a thin wrapper around runc). By modifying the daemon.json configuration file, the administrator registers the nvidia runtime. Setting it as the default-runtime ensures that any container started by the daemon has access to the GPU libraries and binaries (like nvidia-smi) without needing to manually pass the --runtime=nvidia flag every time. A restart of the Docker service is mandatory for these configuration changes to take effect.
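
As a rough illustration of the daemon.json change described above, the following Python sketch merges an nvidia runtime entry into an existing Docker daemon configuration and marks it as the default. The helper name with_nvidia_default_runtime is hypothetical, and the runtime path assumes the nvidia-container-runtime binary is resolvable on the daemon's PATH. In practice, the toolkit's own nvidia-ctk runtime configure --runtime=docker command can register the runtime for you.

```python
import json


def with_nvidia_default_runtime(config: dict) -> dict:
    """Return a copy of a Docker daemon config with 'nvidia' as the default runtime."""
    updated = dict(config)
    runtimes = dict(updated.get("runtimes", {}))
    # Assumption: the runtime binary is on the Docker daemon's PATH.
    runtimes["nvidia"] = {"path": "nvidia-container-runtime"}
    updated["runtimes"] = runtimes
    updated["default-runtime"] = "nvidia"
    return updated


if __name__ == "__main__":
    # Hypothetical existing configuration; unrelated keys are preserved.
    existing = {"log-driver": "json-file"}
    print(json.dumps(with_nvidia_default_runtime(existing), indent=2))
```

The printed JSON is what would be written to /etc/docker/daemon.json before restarting the Docker service (for example with sudo systemctl restart docker).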

Incorrect:

A. Recompile the Linux kernel with the CUDA_SUPPORT=yes flag and reboot the machine.

The NVIDIA driver and Container Toolkit are designed to work with standard, distribution-provided Linux kernels (such as the generic Ubuntu kernel). There is no “CUDA_SUPPORT“ flag in the Linux kernel source, and recompiling the kernel is not part of the NVIDIA-Certified installation workflow. The NCP-AII exam focuses on the installation of the NVIDIA Driver kernel modules, which are built against the existing kernel, rather than replacing the kernel itself.

B. Run the ‘nvidia-smi --factory-reset‘ command to clear the GPU state for container consumption.

nvidia-smi does not document a --factory-reset option. Its actual reset facility, --gpu-reset, is a troubleshooting tool for recovering a GPU's hardware state. Either way, a reset has no functional role in the "Software Stack" installation process and does not enable Docker to communicate with the GPU. Clearing the GPU state does not solve the configuration requirement between the Docker daemon and the NVIDIA runtime.

D. Install the DOCA SDK on the host and map the GPU via a virtual PCIe switch to the Docker container.

This option confuses DPU (Data Processing Unit) management with GPU containerization. The DOCA SDK is used for programming BlueField DPUs. While DPUs can use PCIe switches to manage traffic, standard Docker-based AI workloads on a host GPU do not require “virtual PCIe mapping.“ They rely on the NVIDIA Container Runtime to mount the necessary character devices (like /dev/nvidia0) into the container‘s namespace.

Question 4:

When troubleshooting storage performance for an AI factory, an administrator notices that the GPU utilization is low during training and the iowait metric on the compute nodes is high. What is the most effective optimization to resolve this storage bottleneck?

A. Change the training algorithm from a parallel approach to a sequential approach to reduce the number of simultaneous read requests.

B. Implement NVIDIA GPUDirect Storage (GDS) to enable a direct data path between the storage and the GPU memory, bypassing the CPU.

C. Reduce the resolution of the training images so that the storage system has less data to read from the disks during each epoch.

D. Add more GPUs to each node to increase the total amount of compute power available to process the slow-moving data.

Answer: B

Explanation:

B. Implement NVIDIA GPUDirect Storage (GDS) to enable a direct data path between the storage and the GPU memory, bypassing the CPU.

In the standard I/O path, data must be copied from storage into a "bounce buffer" in CPU system memory before being copied again to the GPU. This consumes CPU cycles and memory bandwidth and adds latency, which contributes to the high iowait and idle GPUs seen in the scenario. GPUDirect Storage (GDS) is the optimization taught in the NCP-AII track for this problem. It uses Direct Memory Access (DMA) to move data directly from the storage interface (like NVMe or NVMe-oF) to GPU memory. This takes the CPU bounce buffer out of the data path (the CPU still issues the I/O requests), reducing iowait, lowering latency, and allowing the GPUs to reach higher utilization.
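
The iowait symptom in this scenario can be quantified before and after a change such as enabling GDS. The sketch below is one possible way to do that on Linux: it compares two snapshots of the aggregate cpu line from /proc/stat and reports the share of CPU time spent in iowait. The function name iowait_percent and the sample counter values are illustrative assumptions, not output from a real cluster.

```python
def iowait_percent(before: str, after: str) -> float:
    """Percent of CPU time spent in iowait between two 'cpu' lines of /proc/stat."""

    def fields(line: str) -> list:
        parts = line.split()
        if parts[0] != "cpu":
            raise ValueError("expected the aggregate 'cpu' line")
        return [int(x) for x in parts[1:]]

    start, end = fields(before), fields(after)
    deltas = [e - s for s, e in zip(start, end)]
    # Field order: user nice system idle iowait irq softirq steal guest guest_nice.
    # guest and guest_nice are already counted in user and nice, so sum only the first 8.
    total = sum(deltas[:8])
    return 100.0 * deltas[4] / total if total else 0.0


if __name__ == "__main__":
    # Synthetic snapshots standing in for two reads of the first line of /proc/stat.
    sample_before = "cpu 100 0 50 800 50 0 0 0 0 0"
    sample_after = "cpu 200 0 100 1000 700 0 0 0 0 0"
    print(f"iowait: {iowait_percent(sample_before, sample_after):.1f}%")
```

On a real compute node, the two input strings would come from reading the first line of /proc/stat a few seconds apart while the training job is running.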

Incorrect:

A. Change the training algorithm from a parallel approach to a sequential approach to reduce the number of simultaneous read requests.

Switching to a sequential approach is antithetical to AI infrastructure goals. Parallelism is the fundamental strength of NVIDIA GPUs. Reducing the number of simultaneous requests might lower the iowait metric, but it would do so by making the training process significantly slower, which is a failure of optimization in an AI Factory context.

C. Reduce the resolution of the training images so that the storage system has less data to read from the disks during each epoch.

While reducing data volume technically lessens the load on storage, it is a workload compromise, not an infrastructure optimization. In the NCP-AII framework, the goal is to build an infrastructure capable of handling the researcher‘s requirements. Changing the science (reducing image resolution) to fit a poorly configured system is not the correct administrative response to a hardware/software bottleneck.

D. Add more GPUs to each node to increase the total amount of compute power available to process the slow-moving data.

Adding more GPUs to a system already suffering from a storage bottleneck will actually worsen the problem. More GPUs create even more demand for data, which would increase the pressure on the already struggling storage path and CPU. The NCP-AII curriculum teaches that you must solve the “starvation“ issue at the source (I/O) before scaling compute.

Question 5:

A ‘burn-in‘ test is being conducted on a new AI cluster using NVIDIA NeMo. Why is a model-specific burn-in test like NeMo preferred over a simple synthetic stress test when validating an AI factory for production use?

A. It automatically repairs any physical layer cable faults by using the BlueField-3‘s ARM cores to re-route traffic through the NVLink fabric.

B. It is required to activate the permanent hardware warranty on the H100 GPUs by registering the burn-in results with the NVIDIA SMI registry.

C. It is the only way to verify that the BMC can successfully communicate with the NVIDIA GPU Cloud to download the latest firmware updates.

D. It stresses the specific communication patterns and memory access behaviors typical of real-world Large Language Model (LLM) training workloads.

Answer: D

Explanation:

D. It stresses the specific communication patterns and memory access behaviors typical of real-world Large Language Model (LLM) training workloads.

An AI Factory designed for production must be validated against the actual workloads it will run. Synthetic stress tests often only push power consumption or local GPU compute. A NeMo-based burn-in utilizes the NVIDIA Collective Communications Library (NCCL) to perform “All-Reduce“ and “All-to-All“ operations. This stresses the NVLink and InfiniBand/Ethernet fabrics, GPUDirect RDMA, and high-bandwidth memory (HBM) in ways that synthetic tests cannot, ensuring the cluster is stable for distributed LLM training.

Incorrect:

A. It automatically repairs any physical layer cable faults by using the BlueField-3‘s ARM cores to re-route traffic through the NVLink fabric.

This is technically inaccurate. While BlueField-3 DPUs have ARM cores for offloading infrastructure tasks, they do not “repair“ physical cable faults. Furthermore, NVLink and the external cluster fabric (InfiniBand/Ethernet) are distinct; traffic cannot simply be re-routed from one to the other to bypass a broken physical cable.

B. It is required to activate the permanent hardware warranty on the H100 GPUs by registering the burn-in results with the NVIDIA SMI registry.

Hardware warranties are associated with the purchase and registration of the physical units, not the performance of a specific software burn-in test. While nvidia-smi is used to monitor health, there is no requirement to “register results“ with a registry to activate a warranty.

C. It is the only way to verify that the BMC can successfully communicate with the NVIDIA GPU Cloud to download the latest firmware updates.

The Baseboard Management Controller (BMC) handles out-of-band management and firmware updates independently of high-level AI frameworks like NeMo. Verifying BMC connectivity is a basic networking step that does not require a model-specific stress test.

Question 6:

A system administrator is using NVIDIA Base Command Manager (BCM) to deploy an OS image across a new cluster of 64 nodes. The administrator needs to ensure that the Slurm scheduler is properly integrated and that the Enroot and Pyxis plugins are installed. What is the specific function of the Pyxis plugin in this AI infrastructure environment?

A. It acts as a distributed file system for storing large datasets

B. It provides a graphical user interface for monitoring GPU temperatures

C. It manages the power cycling of the GPU nodes via the BMC

D. It enables Slurm to launch containerized workloads using Enroot

Answer: D

Explanation:

D. It enables Slurm to launch containerized workloads using Enroot.

In a high-performance AI cluster managed by NVIDIA Base Command Manager (BCM), containers are the standard for reproducibility. Enroot is NVIDIA's tool for turning container images into unprivileged sandboxes, while Pyxis is the specific Slurm SPANK plugin that allows users to run these containers directly via Slurm commands (e.g., srun --container-image=…). Without Pyxis, Slurm would not have the native awareness to invoke Enroot to set up the container environment for a job.
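
To make the srun integration concrete, here is a small Python sketch that assembles a Pyxis-style srun command line. It relies only on the --container-image and --container-mounts options that Pyxis adds to srun; the helper name pyxis_srun_command, the image ubuntu:22.04, and the /data mount are placeholders chosen for illustration.

```python
import shlex
from typing import List, Optional


def pyxis_srun_command(
    image: str, command: List[str], mounts: Optional[List[str]] = None
) -> List[str]:
    """Build an srun argv that asks Pyxis to run `command` inside `image`."""
    argv = ["srun", f"--container-image={image}"]
    if mounts:
        # Each mount is SRC:DST; Pyxis accepts a comma-separated list.
        argv.append("--container-mounts=" + ",".join(mounts))
    return argv + command


if __name__ == "__main__":
    argv = pyxis_srun_command(
        "ubuntu:22.04",                              # placeholder image
        ["grep", "PRETTY_NAME", "/etc/os-release"],  # command run in the container
        mounts=["/data:/data"],                      # placeholder bind mount
    )
    print(shlex.join(argv))
```

The resulting list can be passed to subprocess.run on a login node where Slurm and Pyxis are installed; printing it first is a cheap way to review the exact command.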

Incorrect:

A. It acts as a distributed file system for storing large datasets.

Pyxis is a scheduler plugin, not a storage solution. In an NVIDIA AI Factory, distributed file systems are typically handled by technologies like Lustre, IBM Spectrum Scale (GPFS), or WekaIO, which provide the high-throughput data access required for training.

B. It provides a graphical user interface for monitoring GPU temperatures.

Monitoring and telemetry in an NVIDIA cluster are handled by the NVIDIA Data Center GPU Manager (DCGM) and visualized through tools like Grafana or the Base Command Manager dashboard. Pyxis operates at the command-line/scheduler level and does not provide a GUI.

C. It manages the power cycling of the GPU nodes via the BMC.

Power management and hardware orchestration are functions of the Base Command Manager (BCM) itself, communicating with the Baseboard Management Controller (BMC) via protocols like IPMI or Redfish. Pyxis is strictly focused on container integration within the job scheduler.

Question 7:

During the final verification phase of an AI factory deployment, the team executes a High-Performance Linpack (HPL) test. The results show a significant Rmax value drop compared to the Rpeak theoretical performance. Which cluster-level assessment tool is best suited for identifying if the issue is a specific limping node or a general network congestion issue?

A. The DOCA Benchmarking tool; it isolates the DPU performance from the GPU performance to check for CPU bottlenecks.

B. The Slurm squeue command; it identifies which jobs are pending and allows the administrator to prioritize the HPL task.

C. ClusterKit; it performs multifaceted node assessments and identifies outliers in performance across the entire cluster.

D. The ping utility; it checks for basic ICMP connectivity between the head node and the compute nodes.

Answer: C

Explanation:

C. ClusterKit; it performs multifaceted node assessments and identifies outliers in performance across the entire cluster.

When a large-scale test like High-Performance Linpack (HPL) underperforms, the cause is often a "limping node": a single node that is technically functional but running slower than its peers (due to thermal throttling, memory errors, or PCIe issues). ClusterKit is the primary tool used in NVIDIA AI Factory deployments to run sub-tests (like bandwidth and compute benchmarks) across all nodes simultaneously. It automatically aggregates results to highlight which specific nodes are statistical outliers, allowing administrators to isolate hardware issues from general network congestion.
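
The idea of flagging statistical outliers can be sketched in a few lines of Python. This is not ClusterKit's actual algorithm, which the text above does not describe; it is a generic robust-statistics approach (median and median absolute deviation) applied to made-up per-node benchmark scores. The function name find_limping_nodes, the threshold, and the node names are all assumptions.

```python
from statistics import median
from typing import Dict, List


def find_limping_nodes(scores: Dict[str, float], threshold: float = 3.5) -> List[str]:
    """Flag nodes whose score is far below the cluster median (modified z-score)."""
    values = list(scores.values())
    med = median(values)
    mad = median(abs(v - med) for v in values)  # median absolute deviation
    if mad == 0:
        return []  # no spread to measure against
    slow = []
    for node, value in scores.items():
        z = 0.6745 * (value - med) / mad
        if z < -threshold:  # only unusually slow nodes are of interest
            slow.append(node)
    return sorted(slow)


if __name__ == "__main__":
    # Hypothetical per-node compute results (higher is better).
    results = {
        "node01": 51.2, "node02": 50.8, "node03": 51.0,
        "node04": 38.5, "node05": 50.9, "node06": 51.1,
    }
    print(find_limping_nodes(results))
```

If a single node is flagged, the problem is likely local to that node; if every node scores uniformly low, the shared fabric becomes the more plausible suspect.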

Incorrect:

A. The DOCA Benchmarking tool; it isolates the DPU performance from the GPU performance to check for CPU bottlenecks.

While NVIDIA DOCA is used for DPU (Data Processing Unit) acceleration and management, the DOCA benchmarking suite is focused on network offload and storage performance. It is not the primary tool for diagnosing GPU-heavy computational drops in an HPL test, which primarily stresses the Tensor Cores and NVLink fabric.

B. The Slurm squeue command; it identifies which jobs are pending and allows the administrator to prioritize the HPL task.

The squeue command is a basic job management utility that shows the status of the job queue. It provides zero insight into the performance metrics or hardware health of the nodes running the HPL task. It can tell you that a job is running, but not how well it is performing.

D. The ping utility; it checks for basic ICMP connectivity between the head node and the compute nodes.

Ping only verifies that a node is reachable at the network layer (Layer 3). HPL performance issues are usually related to high-bandwidth interconnects (InfiniBand/NVLink) or floating-point compute efficiency. A node can respond to ping perfectly well while still having a faulty GPU or a degraded 200 Gb/s network link that is causing a massive drop in Rmax.

Question 8:

A system administrator needs to optimize an NVIDIA BlueField network platform to handle intensive data movement for a large-scale AI cluster. Which configuration step is necessary to enable the DPU to perform offloaded hardware acceleration for InfiniBand or Ethernet traffic in a production environment?

A. Utilize the NVIDIA SMI tool to flash the BlueField firmware directly onto the HGX baseboard to unify the management of the network and compute layers.

B. Set the MIG profile to 1g.10gb on the BlueField DPU to ensure that the network traffic is partitioned into small, manageable virtual streams for the GPU.

C. Disable the internal ARM cores on the BlueField DPU to allow the host CPU to take over the network steering logic for better AI workload synchronization.

D. Configure the DPU in DPU-Mode (rather than Separated-Mode) and ensure the correct DOCA runtime environment is provisioned to manage the acceleration engines.

Answer: D

Explanation:

D. Configure the DPU in DPU-Mode (rather than Separated-Mode) and ensure the correct DOCA runtime environment is provisioned to manage the acceleration engines.

In an AI Factory, the BlueField DPU must be set to DPU-Mode (also known as embedded function, or ECPF, ownership mode) to act as an independent compute node that manages its own network stack and security policies. In this mode, the DPU's ARM cores run an OS (typically Ubuntu) and use the NVIDIA DOCA (Data Center Infrastructure-on-a-Chip Architecture) framework to offload tasks like data encryption, storage virtualization, and network telemetry from the host CPU and GPU.

Incorrect:

A. Utilize the NVIDIA SMI tool to flash the BlueField firmware directly onto the HGX baseboard to unify the management of the network and compute layers.

The NVIDIA SMI (nvidia-smi) tool is primarily for GPU management. BlueField DPUs have their own firmware and management tools (like mstflint or bfcfg). Furthermore, DPU firmware is flashed to the DPU's own flash memory, not the HGX baseboard, which is a separate physical component for GPU interconnectivity.

B. Set the MIG profile to 1g.10gb on the BlueField DPU to ensure that the network traffic is partitioned into small, manageable virtual streams for the GPU.

MIG (Multi-Instance GPU) is a feature specific to NVIDIA GPUs (like the A100 or H100) that allows a single physical GPU to be partitioned into multiple hardware-isolated instances. It does not apply to BlueField DPUs. DPU traffic isolation is typically handled through SR-IOV (Single Root I/O Virtualization) or VirtIO, not MIG profiles.

C. Disable the internal ARM cores on the BlueField DPU to allow the host CPU to take over the network steering logic for better AI workload synchronization.

The entire purpose of a DPU is to offload processing from the host CPU. Disabling the ARM cores would effectively turn the DPU into a standard NIC (Network Interface Card) or “dumb“ adapter, defeating the purpose of the BlueField platform in a high-scale AI environment where host CPU cycles are needed for other management tasks.

Question 9:

An administrator is installing Base Command Manager (BCM) to orchestrate a new AI cluster. During the setup of the head node, they must configure High Availability (HA). What is the primary mechanism BCM uses to ensure the cluster remains operational if the primary head node suffers a catastrophic hardware failure?

A. BCM requires the administrator to manually copy the Slurm configuration to a USB drive and plug it into a different server whenever the primary node fails.

B. BCM configures a secondary head node that synchronizes its database and configuration files with the primary; it uses a heartbeat mechanism to trigger an automatic failover.

C. BCM uses a round-robin DNS strategy to distribute Slurm job requests to all compute nodes simultaneously, bypassing the need for a management node.

D. BCM utilizes the GPU‘s NVLink interconnect to mirror the entire operating system of the head node onto the first compute node in the cluster.

Answer: B

Explanation:

B. BCM configures a secondary head node that synchronizes its database and configuration files with the primary; it uses a heartbeat mechanism to trigger an automatic failover.

High Availability in Base Command Manager (BCM) is achieved by deploying a pair of head nodes. The primary head node continuously synchronizes its internal database, configuration files, and software repositories with the secondary node. A heartbeat (typically over a dedicated management network) monitors the health of the primary. If the heartbeat is lost, the secondary node automatically assumes the “active“ role, taking over the cluster‘s virtual IP (VIP) and management services to ensure zero-to-minimal downtime for the AI factory.
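
The heartbeat-and-failover logic described above can be modeled as a simple timeout check. The Python sketch below is a toy model under stated assumptions, not BCM's implementation: the class name StandbyHeadNode, the 10-second timeout, and the timestamps are invented for illustration, and claiming the virtual IP is reduced to a comment.

```python
from dataclasses import dataclass


@dataclass
class StandbyHeadNode:
    timeout_s: float          # silence longer than this triggers failover
    last_heartbeat_s: float   # time of the most recent heartbeat from the primary
    active: bool = False      # becomes True once this node has taken over

    def on_heartbeat(self, now_s: float) -> None:
        """Record a heartbeat received from the primary head node."""
        self.last_heartbeat_s = now_s

    def tick(self, now_s: float) -> bool:
        """Promote this node if the primary has been silent for too long."""
        if not self.active and now_s - self.last_heartbeat_s > self.timeout_s:
            # A real system would also claim the virtual IP and start services here.
            self.active = True
        return self.active


if __name__ == "__main__":
    standby = StandbyHeadNode(timeout_s=10.0, last_heartbeat_s=0.0)
    standby.on_heartbeat(5.0)                    # primary is still alive at t=5
    for now in (12.0, 15.0, 15.5):
        print(now, "active" if standby.tick(now) else "standby")
```

Passing timestamps in explicitly, rather than reading a clock inside the class, keeps the failover decision easy to test.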

Incorrect:

A. BCM requires the administrator to manually copy the Slurm configuration to a USB drive and plug it into a different server whenever the primary node fails.

This describes a manual recovery process, not a High Availability mechanism. Modern AI infrastructure requires automated failover. BCM is designed to handle synchronization and state transitions programmatically without physical media intervention.

C. BCM uses a round-robin DNS strategy to distribute Slurm job requests to all compute nodes simultaneously, bypassing the need for a management node.

Round-robin DNS is a load-balancing technique for web traffic, not a cluster management HA strategy. A management node (Head Node) is essential in a Slurm environment to act as the central controller (slurmctld). Without a head node or a functional HA pair, the scheduler cannot manage resource allocations or job queues.

D. BCM utilizes the GPU‘s NVLink interconnect to mirror the entire operating system of the head node onto the first compute node in the cluster.

NVLink is a high-speed, point-to-point interconnect designed for GPU-to-GPU data transfers during model training; it is not used for operating system mirroring or cluster management tasks. Additionally, head nodes are typically CPU-heavy management servers and often do not even contain the GPUs necessary for an NVLink fabric connection.

Question 10:

When installing Base Command Manager (BCM) as the control plane for an AI cluster, the administrator must configure High Availability (HA) for the head node. What is the primary reason for establishing a secondary head node in a BCM environment, and how does the system typically handle a failure of the primary node?

A. The secondary node provides redundancy for the cluster management database and services, using a heartbeat mechanism to trigger an automatic failover.

B. The secondary node acts as a backup storage server that only turns on when the primary node runs out of disk space for user home directories.

C. HA is used to allow the administrator to run two different versions of the operating system simultaneously for testing purposes without affecting users.

D. The secondary node is used to double the compute power of the cluster by sharing the scheduling load with the primary node during peak hours.

Answer: A

Explanation:

A. The secondary node provides redundancy for the cluster management database and services, using a heartbeat mechanism to trigger an automatic failover.

The NCP-AII blueprint specifies that an HA configuration involves two head nodes: a Primary and a Secondary. BCM uses a heartbeat mechanism to constantly monitor the health of the primary node. If the primary node‘s management daemon (cmdaemon) or the physical hardware fails, the secondary node detects the loss of the heartbeat and automatically takes over the cluster‘s Virtual IP (VIP) and management services. This ensures that the MariaDB/MySQL management database remains synchronized and that compute nodes can continue to communicate with a controller without manual intervention.

Incorrect:

B. The secondary node acts as a backup storage server that only turns on when the primary node runs out of disk space for user home directories.

In an NVIDIA-Certified system, user data and “home“ directories are typically stored on high-performance third-party storage (like DDN, NetApp, or VAST) or a dedicated storage tier, not on the head node‘s local disks. The head node is a control plane device, and the HA secondary node is kept in a “hot-standby“ or active state, not a “storage-triggered“ power-on state.

C. HA is used to allow the administrator to run two different versions of the operating system simultaneously for testing purposes without affecting users.

High Availability requires configuration symmetry. For a failover to be successful, both the primary and secondary head nodes must run identical versions of the BCM software and the underlying operating system. Running mismatched versions would lead to database corruption or service incompatibility during a failover event, which is the opposite of the “Reliability“ goal taught in the NCP-AII course.

D. The secondary node is used to double the compute power of the cluster by sharing the scheduling load with the primary node during peak hours.

This confuses High Availability with “Load Balancing.“ While some BCM components can be distributed, the primary purpose of the HA secondary node is Redundancy, not performance scaling. The secondary node does not actively schedule jobs alongside the primary; it waits to take over the primary‘s duties only in the event of a failure to ensure cluster persistence.
