NVIDIA Certified Associate Generative AI (NCA-GENL) Practice Questions

Practice questions and explanations for the NVIDIA Certified Associate Generative AI (NCA-GENL) certification exam, covering LLMs, transformers, embeddings, vector databases, prompt engineering, AI infrastructure, responsible AI, and generative AI fundamentals.

Question 1: Which of the following best describes Word2Vec?

a) A programming language used to build artificial intelligence models.
b) A statistical technique used to analyze word frequency in a text corpus.
c) A deep learning algorithm used to generate word embeddings from text data.
d) A database management system designed for storing and querying word data.

Question 2: In the context of language models, what does an autoregressive model predict?

a) The probability of the next token in a text given the previous tokens.
b) The probability of the next token using a Monte Carlo sampling of past tokens.
c) The next token solely using recurrent network or LSTM cells.
d) The probability of the next token by looking at the previous and future input tokens.

Question 3: In large language models, what is the purpose of the attention mechanism?

a) To measure the importance of the words in the output sequence.
b) To determine the order in which words are generated.
c) To capture the order of the words in the input sequence.
d) To assign weights to each word in the input sequence.
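
To make the idea of attention concrete, here is a minimal sketch of scaled dot-product attention weights for a single query, using only the standard library. The vectors and the function name `attention_weights` are invented for illustration; a real model learns its query and key projections during training.

```python
import math

def softmax(scores):
    # Subtract the max for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(query, keys):
    # Scaled dot-product: score_i = (q . k_i) / sqrt(d), then softmax.
    d = len(query)
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
        for key in keys
    ]
    return softmax(scores)

# Toy vectors (assumed values, not taken from any real model).
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
weights = attention_weights(query, keys)
print([round(w, 3) for w in weights])
```

The weights sum to 1, and the key most aligned with the query receives the largest share.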

Question 4: In the transformer architecture, what is the purpose of positional encoding?

a) To remove redundant information from the input sequence.
b) To encode the semantic meaning of each token in the input sequence.
c) To add information about the order of each token in the input sequence.
d) To encode the importance of each token in the input sequence.
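
As a rough illustration of positional encoding, the sketch below follows the sinusoidal scheme described in the original transformer paper. The function name `positional_encoding` and the tiny `d_model` are assumptions chosen for readability; many current models use learned or rotary encodings instead.

```python
import math

def positional_encoding(position, d_model):
    # Even indices use sine, odd indices use cosine, with wavelengths
    # that grow geometrically across the embedding dimensions.
    encoding = []
    for i in range(d_model):
        angle = position / (10000 ** ((2 * (i // 2)) / d_model))
        encoding.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return encoding

pe0 = positional_encoding(0, 4)
pe1 = positional_encoding(1, 4)
print(pe0)
print([round(v, 4) for v in pe1])
```

Each position yields a different vector, which is added to the token embedding so that otherwise identical tokens can be told apart by where they occur.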

Question 5: In the field of machine learning, which of the following serves as the fundamental basis for enabling a model to learn from data?

a) Algorithms and statistical methods.
b) The "Attention Is All You Need" research paper.
c) The transformer architecture.
d) Large-scale data sets.

Question 6: In a standard machine learning workflow, which key step logically fits between the training and evaluation phases?

a) Inference.
b) Data collection.
c) Model deployment.
d) Feature engineering.

Question 7: To enhance the transparency of a large language model and mitigate the black-box problem, which technique is most effective?

a) Implementing Retrieval Augmented Generation (RAG) to connect the model to verifiable external knowledge sources.
b) Relying solely on topical guardrails to limit the model's conversation scope.
c) Publishing all models and software on a platform like NVIDIA's NGC.
d) Increasing the size of the training data set to improve statistical accuracy.

Question 8: When validating an AI system that processes sensitive user data, what is a crucial step for building trust and ensuring security and reliability?

a) Achieving compliance with a recognized standard like ISO/IEC 27001 for information security management.
b) Using the NVIDIA TAO toolkit with transfer learning.
c) Deploying the application using Helm charts for Kubernetes management.
d) Maximizing the model's F1 score.

Question 9: Which library is specifically designed and optimized for linguistic processing tasks like tokenization, lemmatization, and part-of-speech (POS) tagging?

a) spaCy.
b) pandas.
c) scikit-learn.
d) Matplotlib.

Question 10: Which library is most suitable to accelerate a pandas workflow on an NVIDIA GPU with minimal changes to existing code?

a) cuDF.
b) NumPy.
c) Apache Spark.
d) Dask.

Question 11: Which applications and corresponding evaluation metrics are appropriate for comparing generated text against human-created reference texts? (Choose two)

a) Application: image classification; Metric: accuracy.
b) Application: machine translation; Metric: BLEU.
c) Application: sentiment analysis; Metric: F1 score.
d) Application: text summarization; Metric: ROUGE.
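
Reference-based text metrics rest on n-gram overlap between a candidate and a human reference. The sketch below shows only the unigram core of that idea: clipped precision (the building block of BLEU) and recall (the building block of ROUGE-1). It is a simplification under stated assumptions: full BLEU also combines higher-order n-grams with a brevity penalty, and the helper names here are hypothetical.

```python
from collections import Counter

def clipped_overlap(candidate, reference):
    # Count each candidate token at most as often as it appears in the reference.
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    return sum(min(count, ref[token]) for token, count in cand.items())

def unigram_precision(candidate, reference):
    # Share of candidate tokens that are supported by the reference.
    return clipped_overlap(candidate, reference) / len(candidate.split())

def unigram_recall(candidate, reference):
    # Share of reference tokens that the candidate recovered.
    return clipped_overlap(candidate, reference) / len(reference.split())

reference = "the cat sat on the mat"
candidate = "the cat is on the mat"
precision = unigram_precision(candidate, reference)
recall = unigram_recall(candidate, reference)
print(round(precision, 3), round(recall, 3))
```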

Question 12: Which learning technique describes a model identifying a new class (e.g., an okapi) it has never seen during training, using only a text description?

a) One-shot learning.
b) Reinforcement learning.
c) Zero-shot learning.
d) Supervised learning.

Question 13: What are the primary advantages of using a multi-GPU configuration compared to a single GPU? (Choose two)

a) It inherently increases the final predictive accuracy of the model.
b) It reduces the time required for the training process.
c) It eliminates the need for data pre-processing.
d) It enables the training of larger models that exceed the memory capacity of a single GPU.

Question 14: If a real-time AI service has a bottleneck in the data encoding step, what is the most direct outcome of accelerating this step with a GPU?

a) Predictive accuracy will increase.
b) The model will require less training data.
c) Operational costs will be completely eliminated.
d) The overall latency of the service will decrease.

Question 15: Within the transformer architecture, what specific component is introduced to provide the model with information about word order and position?

a) The feed-forward network.
b) Word embeddings.
c) Positional encoding.
d) Attention mechanism.

Question 16: Which NVIDIA framework is specifically designed to facilitate building, training, and customizing large language models (LLMs)?

a) NVIDIA Triton Inference Server.
b) NVIDIA TensorRT.
c) NVIDIA NeMo Framework.
d) NVIDIA cuDF.

Question 17: Which NVIDIA SDK is specifically designed for deploying high-performance conversational AI services like speech-to-text and text-to-speech?

a) NVIDIA Omniverse.
b) NVIDIA NeMo Framework.
c) NVIDIA cuDF.
d) NVIDIA Riva.

Question 18: Which NVIDIA tool is a scalable solution designed to deploy and manage multiple trained AI models regardless of their original framework?

a) NVIDIA NeMo Framework.
b) NVIDIA Triton Inference Server.
c) NVIDIA TAO Toolkit.
d) NVIDIA TensorRT.

Question 19: Which open-source toolkit from NVIDIA is designed to add programmable safety controls (e.g., preventing prompt injections) to a chatbot?

a) NVIDIA Triton Inference Server.
b) NVIDIA TensorRT.
c) NeMo Guardrails.
d) NVIDIA cuML.

Question 20: Which embedding model is designed to generate contextualized word representations, allowing it to differentiate meanings of the same word based on surrounding context?

a) Word2Vec.
b) TF-IDF.
c) One-hot encoding.
d) BERT.

Question 21: Which of the following is a tree-based ensemble algorithm widely recognized for its high performance and efficiency in prediction tasks using structured tabular data?

a) k-means clustering
b) A large-scale transformer model
c) XGBoost
d) Linear regression

Question 22: When evaluating the performance of a language model, what does a lower perplexity score signify?

a) The model has a larger number of parameters.
b) The model is more surprised by the test data.
c) The model was trained on a smaller data set.
d) The model is more confident and accurate in its predictions.
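
Perplexity is the exponential of the average negative log-likelihood the model assigns to the observed tokens. The sketch below computes it from a list of per-token probabilities; the probability values are made up for illustration.

```python
import math

def perplexity(token_probs):
    # exp of the mean negative log-probability of the observed tokens.
    avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_nll)

confident = perplexity([0.9, 0.8, 0.95])   # model assigns high probability
uncertain = perplexity([0.2, 0.1, 0.25])   # model assigns low probability
uniform = perplexity([0.25] * 4)           # uniform guess over 4 choices
print(round(confident, 3), round(uncertain, 3), round(uniform, 3))
```

A model that guesses uniformly among four options has a perplexity of about 4, which is why perplexity is often read as an effective branching factor.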

Question 23: In the context of natural language processing, what is a key distinction between the perplexity metric and the BLEU score?

a) A lower score is better for BLEU while a higher score is better for perplexity.
b) Perplexity is used for image classification while BLEU is used for machine translation.
c) Perplexity evaluates how well a language model predicts a sequence of text, while BLEU compares a model's generated output against a human reference.
d) Both metrics are identical and can be used interchangeably to evaluate any NLP task.

Question 24: To mitigate hallucinations and ground a large language model's responses in verifiable external facts, a Retrieval-Augmented Generation (RAG) system is implemented. Which of the following components are essential and work together to achieve this? (Choose two)

a) The generator, which is an LLM that synthesizes the final answer based on the augmented prompt.
b) The fine-tuning module, which continuously retrains the LLM on new data.
c) The retriever, which searches a knowledge base for relevant information to augment the prompt.
d) The output validator, which checks the generated text for grammatical errors.
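
The retrieve-then-generate flow can be sketched in a few lines. Everything here is a toy stand-in: the knowledge base is invented, `retrieve` ranks by simple word overlap rather than embedding similarity, and the call to an actual LLM is left out, so the sketch stops at the augmented prompt that a generator would receive.

```python
# Hypothetical knowledge base; a real system would use a vector database.
KNOWLEDGE_BASE = [
    "the support desk opens at nine on weekdays",
    "refunds are processed within five business days",
    "the cafeteria serves lunch at noon",
]

def retrieve(question, documents, top_k=1):
    # Rank documents by the number of words they share with the question.
    q_words = set(question.lower().split())
    ranked = sorted(
        documents,
        key=lambda doc: len(q_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:top_k]

def build_prompt(question, context):
    # Augment the prompt with retrieved context before generation.
    joined = "\n".join(context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"

question = "when are refunds processed"
context = retrieve(question, KNOWLEDGE_BASE)
prompt = build_prompt(question, context)
print(prompt)
```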

Question 25: Which NVIDIA platform is specifically designed for an end-to-end GPU-accelerated data science pipeline, offering a familiar Python API similar to pandas and scikit-learn?

a) Apache Spark
b) Matplotlib
c) NVIDIA RAPIDS
d) NVIDIA NeMo Framework

Question 26: In the initial phase of a data science project, what are the primary objectives of Exploratory Data Analysis (EDA)? (Choose two)

a) To deploy the final machine learning model into a production environment.
b) To identify potential anomalies, errors, and missing values within the data set.
c) To select the final hyperparameters for the model through grid search.
d) To summarize the data's main characteristics and uncover initial patterns and relationships.

Question 27: The core mechanism of a diffusion model involves two distinct phases. What are these two phases?

a) A compression phase (downsizing) and a rendering phase (upscaling).
b) A forward process where noise is progressively added to an image and a reverse process where a model learns to denoise it.
c) A feature extraction phase using a convolutional network and a classification phase using a feed-forward network.
d) A generator phase that creates an image and a discriminator phase that validates its authenticity.

Question 28: Which NVIDIA SDK is specifically designed to maximize throughput and minimize latency during inference on NVIDIA GPUs without changing the model's architecture?

a) NVIDIA Triton Inference Server
b) NVIDIA NeMo Framework
c) NVIDIA TensorRT
d) scikit-learn

Question 29: Which open standard should be used to represent a machine learning model to ensure interoperability, allowing a model trained in one framework (like PyTorch) to be used in another (like TensorFlow)?

a) NVIDIA TensorRT
b) ONNX (Open Neural Network Exchange)
c) A Docker container
d) A Python pickle file

Question 30: In a neural network, what are the primary roles of an activation function? (Choose two)

a) To introduce nonlinear properties, allowing the network to learn complex data patterns.
b) To group the input data into distinct clusters similar to the k-means algorithm.
c) To determine the output signal of a neuron, deciding whether it should be activated based on its weighted input.
d) To normalize the input data to have a mean of zero and a standard deviation of one.
e) To directly calculate and apply the weight and bias updates during backpropagation.

Question 31: Which of the following algorithms is most suitable for an unsupervised clustering task where the data set does not contain any pre-existing labels?

a) Linear regression
b) A classification algorithm like Support Vector Machine (SVM)
c) Generative Adversarial Network (GAN)
d) k-means clustering

Question 32: Which natural language processing (NLP) task is specifically designed to extract and categorize specific pieces of information such as company names, executive names, and monetary values?

a) Sentiment analysis
b) Text summarization
c) Machine translation
d) Named Entity Recognition (NER)

Question 33: When optimizing a deep learning model for deployment, what are the primary advantages of applying model quantization? (Choose two)

a) A significant increase in the model's predictive accuracy.
b) Faster inference speed due to the use of lower-precision integer arithmetic.
c) A reduced memory footprint and smaller model file size.
d) It simplifies the process of data collection and pre-processing.
e) It allows the model to be trained with much less data.

Question 34: The process of model quantization often involves converting weights from FP32 to int8. What are the direct consequences of this process? (Choose three)

a) Increased computational throughput and lower inference latency.
b) A reduction in the numerical precision of the model's parameters.
c) A guaranteed improvement in the model's predictive accuracy.
d) Smaller model size and reduced memory (RAM/VRAM) requirements.
e) A significant increase in the time required to train the model.
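
A small sketch of FP32-to-int8 conversion may help here. It uses symmetric per-tensor quantization, which is one common scheme among several; the weight values and function names are assumptions for illustration, and production toolchains add calibration and per-channel scales.

```python
import struct

def quantize_int8(weights):
    # Symmetric quantization: map [-max_abs, max_abs] onto [-127, 127].
    scale = max(abs(w) for w in weights) / 127
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    # Recover approximate floating-point values from the integers.
    return [v * scale for v in q]

weights = [0.12, -0.5, 0.33, 0.9, -0.07]  # toy FP32 weights
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Storage cost: 4 bytes per FP32 value versus 1 byte per int8 value.
fp32_bytes = len(struct.pack(f"{len(weights)}f", *weights))
int8_bytes = len(struct.pack(f"{len(q)}b", *q))
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, fp32_bytes, int8_bytes, max_error)
```

The integers take a quarter of the space, and the restored values differ slightly from the originals, which is the precision loss that quantization trades for size and speed.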

Question 35: To evaluate a binary classification model's performance by providing a balance between precision and recall, which of the following metrics is most suitable?

a) BLEU score
b) Perplexity
c) Mean Absolute Error (MAE)
d) F1 score

Question 36: Why might accuracy be a poor or misleading metric for evaluating a model's performance on an imbalanced data set (e.g., a fraud detection task where only 1% of cases are fraud)?

a) Accuracy is only used for regression tasks, not classification.
b) A model that simply predicts "not fraud" for every transaction would also achieve 99% accuracy.
c) 99% accuracy is generally considered too low for financial applications.
d) The F1 score must be calculated first to determine the accuracy.
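
The scenario in Question 36 can be checked numerically. The sketch below scores a trivial classifier that always predicts "not fraud" on a data set with 1% positives; the data is synthetic and the helper `evaluate` is a hypothetical name.

```python
def evaluate(y_true, y_pred):
    # Confusion-matrix counts for the positive class (label 1).
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)

    accuracy = correct / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, precision, recall, f1

y_true = [1] * 1 + [0] * 99          # 1% fraud, 99% legitimate
always_negative = [0] * 100          # predicts "not fraud" every time
accuracy, precision, recall, f1 = evaluate(y_true, always_negative)
print(accuracy, precision, recall, f1)
```

Accuracy looks excellent while recall and F1 are zero, because the classifier never catches a single fraud case.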

Question 37: Which technology enables the partitioning of a single physical GPU into multiple independent, securely isolated instances with guaranteed portions of compute and memory?

a) NVLink
b) Data parallelism
c) NVIDIA RAPIDS
d) Multi-Instance GPU (MIG)

Question 38: What is the experimental method called where you randomly show version A (e.g., a blue button) to one group and version B (e.g., a green button) to another and compare the results?

a) k-means clustering
b) Decision tree analysis
c) A/B testing
d) Exploratory Data Analysis (EDA)

Question 39: Which of the following statements accurately describe the characteristics of a single decision tree algorithm? (Choose two)

a) It is an unsupervised learning algorithm used primarily for clustering data without labels.
b) It can be used for both classification and regression tasks.
c) It is a type of deep neural network that uses activation functions and backpropagation.
d) It makes predictions by learning a series of simple if-then-else rules from the data features.
e) It operates by building a multitude of trees on random subsets of data and averaging their predictions.

Question 40: What is the text normalization technique called that transforms different forms of a word (like "is," "am," "are") back to their base dictionary form or lemma (like "be")?

a) Stemming
b) Tokenization
c) Named Entity Recognition (NER)
d) Lemmatization

Question 41: Given the sentence "The quick brown fox jumps," what is the correct set of bigrams (2-grams)?

a) the, quick, brown, fox, jumps
b) A similarity score used to compare it with another sentence
c) {the quick, quick brown, brown fox, fox jumps}
d) the quick brown, quick brown fox, brown fox jumps
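
N-grams are contiguous runs of n tokens, which a sliding window produces directly. This is a minimal sketch that assumes whitespace tokenization and lowercasing; the function name `ngrams` is illustrative.

```python
def ngrams(sentence, n):
    # Slide a window of size n across the whitespace-separated tokens.
    tokens = sentence.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

bigrams = ngrams("The quick brown fox jumps", 2)
trigrams = ngrams("The quick brown fox jumps", 3)
print(bigrams)
print(trigrams)
```

A sentence with k tokens yields k - n + 1 n-grams, so five tokens give four bigrams and three trigrams.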

Question 42: Which of the following statements correctly describe the characteristics and typical use cases of Principal Component Analysis (PCA) and t-Distributed Stochastic Neighbor Embedding (t-SNE)? (Choose two)

a) t-SNE is primarily used for preserving the local structure of the data, making it excellent for visualizing distinct clusters.
b) PCA is a supervised learning technique that requires labeled data to perform dimensionality reduction.
c) Both PCA and t-SNE are guaranteed to find the same patterns and produce nearly identical visualizations.
d) PCA is a linear technique that aims to capture the maximum variance in the data, making it effective for understanding the global data structure.
e) The primary purpose of both techniques is to train a predictive model for classification.

Question 43: In the field of natural language processing (NLP), what is the correct term for a large, structured collection of text documents used for training language models and conducting linguistic research?

a) Hyperparameter
b) Checkpoint
c) Corpus
d) Algorithm

Question 44: What is the primary purpose of the technique known as dropout during the training of neural networks?

a) To introduce nonlinearity into the network, allowing it to learn complex patterns.
b) To permanently remove unimportant neurons after training to reduce the model's file size.
c) To prevent the model from overfitting by making it learn more robust and redundant features.
d) To calculate the gradient of the loss function with respect to the model's weights.
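
For intuition about what dropout does mechanically, here is a sketch of the "inverted dropout" variant used by most frameworks at training time. The function signature and the fixed seed are assumptions for illustration; at inference time dropout is simply switched off.

```python
import random

def dropout(activations, rate, rng):
    # Inverted dropout: zero each activation with probability `rate` and
    # scale the survivors by 1 / (1 - rate) so the expected value is unchanged.
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

rng = random.Random(0)               # seeded only to make runs repeatable
activations = [1.0] * 10
dropped = dropout(activations, rate=0.5, rng=rng)
unchanged = dropout(activations, rate=0.0, rng=rng)
print(dropped)
```

Because a different random subset of units is silenced on every step, the network cannot rely on any single unit.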

Question 45: While recurrent neural networks (RNNs) are designed for sequential data, in which of the following applications are convolutional neural networks (CNNs) most widely and effectively used?

a) Time series forecasting
b) Image classification and object detection
c) Customer segmentation using k-means
d) Predicting housing prices from a tabular data set

Question 46: What is the key architectural feature of a recurrent neural network (RNN) that enables it to process sequential data?

a) A set of convolutional layers designed to detect spatial patterns.
b) A purely feed-forward structure where information only moves in one direction without loops.
c) The exclusive use of linear activation functions throughout the network.
d) A feedback loop or hidden state that passes information from one time step to the next.

Question 47: During the training of a neural network, what is the primary purpose of the backpropagation algorithm?

a) To perform the forward pass where input data is fed through the network to generate a prediction.
b) To randomly deactivate neurons in a layer to prevent the model from overfitting.
c) To initialize the weights and biases of the network before the training process begins.
d) To calculate the gradient of the loss function with respect to the network's weights, enabling learning.

Question 48: What is the primary purpose of the NVIDIA NGC catalog?

a) An online marketplace for buying and selling third-party software licenses.
b) A source code repository for collaborating on open-source projects similar to GitHub.
c) A cloud computing platform for renting virtual machines equipped with GPUs.
d) A central hub for accessing GPU-accelerated software such as containers, pre-trained models, and SDKs.

Question 49: In the context of training a machine learning model, what is the correct definition of one epoch?

a) A single forward and backward pass of one batch of data through the network.
b) The final evaluation of the model's performance on the test data set.
c) One complete pass through the entire training data set.
d) The process of tuning the model's hyperparameters such as the learning rate.

Question 50: What is the primary role of the NVIDIA Base Command Platform in an AI-powered data center?

a) Hardware interconnect for linking multiple DGX systems with high bandwidth.
b) A command-line interface for monitoring the status of a single NVIDIA GPU.
c) The software suite for managing AI development workflows and orchestrating DGX infrastructure.
d) A library for accelerating data processing tasks similar to pandas but on the GPU.

Question 51: Which software solution is specifically designed to meet enterprise-grade requirements such as long-term stability, robust security, and professional support (SLAs)?

a) Downloading the latest open-source packages directly from public repositories.
b) NVIDIA AI Enterprise
c) A standard cloud virtual machine with a base operating system.
d) A single open-source framework like PyTorch.

Question 52: Which NVIDIA offering is designed to package generative AI models as easy-to-use, scalable microservices with standard APIs to simplify integration?

a) NVIDIA DGX system
b) NVIDIA NIM (NVIDIA Inference Microservices)
c) NVIDIA NeMo Framework
d) NVIDIA Triton Inference Server

Question 53: What is the primary function of the backpropagation algorithm in the training of a neural network such as an RNN?

a) To distribute parts of a model across multiple GPUs for parallel computation.
b) To serve the trained model in a production environment with low latency.
c) To randomly ignore certain neurons during training to prevent overfitting.
d) To enable the network to learn by calculating the gradient of the loss function with respect to the network's weights.

Question 54: Which NVIDIA offering provides a collection of optimized microservices specifically for core RAG tasks, including embedding documents and searching for relevant context?

a) NVIDIA Omniverse
b) NeMo Guardrails
c) NVIDIA NeMo Retriever
d) A custom Python script using scikit-learn

Question 55: Which combination of tools is most appropriate for building a complex generative AI application (the entire end-to-end workflow) and connecting different AI models/tools (agent logic)?

a) Use NVIDIA AI workflows for connecting models and LangChain for building the application.
b) Use NVIDIA AI workflows for both building the application and connecting the models.
c) Use NVIDIA AI workflows for building the application and LangChain for connecting the models.
d) Use LangChain for both building the application and connecting the models.

Question 56: Which of the following components and techniques are key to the parallel processing capability of the transformer architecture? (Choose two)

a) A recurrent feedback loop that processes tokens one by one in a strict sequence.
b) Multi-head attention, which allows different attention heads to be computed independently and in parallel.
c) The model's ability to process all input tokens simultaneously rather than sequentially.
d) A k-means clustering algorithm to group similar tokens before processing.
e) A single attention mechanism that must process all relationships serially.

Question 57: Which technology is specifically designed to protect sensitive data while it is actively being processed in memory during use?

a) Standard software-based encryption
b) A virtual private network (VPN)
c) Confidential Computing
d) A network firewall

Question 58: Which activation function is most suitable for the output layer of a binary classification problem to represent a probability (0 to 1)?

a) ReLU (Rectified Linear Unit)
b) Tanh (Hyperbolic Tangent)
c) Softmax
d) Sigmoid
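
The activation functions listed above differ mainly in their output range. As one illustration, the sketch below implements the sigmoid function in a numerically stable form and shows that it squashes any real input into the interval between 0 and 1; the function is written from the standard definition, not taken from any particular library.

```python
import math

def sigmoid(z):
    # Branch on the sign of z so math.exp never receives a large positive value.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

for z in (-5.0, 0.0, 5.0):
    print(z, round(sigmoid(z), 4))
```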

Question 59: Which of the following is a popular parameter-efficient fine-tuning (PEFT) method that freezes the original model weights and only trains a small number of new adapter parameters?

a) Full fine-tuning
b) k-means clustering
c) LoRA (Low-Rank Adaptation)
d) Post-training quantization (PTQ)
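
The savings from adapter-based fine-tuning are easy to quantify. In low-rank adaptation, a frozen weight matrix of shape d x k is paired with two small trainable matrices of shapes d x r and r x k. The sketch below compares parameter counts; the layer size and rank are illustrative assumptions, not values from any specific model.

```python
def lora_trainable_params(d, k, r):
    # Two low-rank factors: B is d x r and A is r x k.
    return r * (d + k)

d, k, r = 4096, 4096, 8              # assumed layer shape and adapter rank
full = d * k                         # parameters updated by full fine-tuning
lora = lora_trainable_params(d, k, r)
ratio = lora / full
print(full, lora, f"{ratio:.4%}")
```

For this layer the adapter trains 65,536 parameters instead of 16,777,216, roughly 0.39% of the original.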

Question 60: What is the primary characteristic that distinguishes generative AI from other types of AI such as predictive or analytical AI?

a) Its ability to cluster unlabeled data into distinct groups.
b) Its ability to classify existing data into predefined categories.
c) Its ability to create new, original content such as text, images, or music that resembles the data it was trained on.
d) Its ability to predict a continuous numerical value based on input features.

Question 61: In the transformer architecture, what is the primary purpose of positional encoding?

a) To capture the semantic meaning of each word in the vocabulary.
b) To calculate the contextual importance of each word relative to others in the sequence.
c) To provide the model with information about the order and position of tokens in a sequence.
d) To apply a final nonlinear transformation to the output of the attention block.

Question 62: While the Word2Vec model is used to learn vector representations for individual words, the Doc2Vec model is an extension of this technique. What is the primary purpose of Doc2Vec?

a) To generate a numerical vector representation for each individual word in the vocabulary.
b) To classify a document into predefined categories like sports or politics.
c) To generate a single numerical vector that represents the semantic meaning of an entire document or paragraph.
d) To create a new synthetically generated document that is stylistically similar to the input.

Question 63: Which of the following options correctly lists the steps of a typical machine learning workflow in the proper sequential order?

a) Model training, model evaluation, data collection, data pre-processing, model deployment.
b) Data collection, model deployment, data pre-processing, model training, model evaluation.
c) Data collection, data pre-processing, model training, model evaluation, model deployment.
d) Data pre-processing, data collection, model training, model deployment, model evaluation.

Question 64: In the process of training a neural network, what is the primary role of the backpropagation algorithm?

a) To perform the initial forward pass of data through the network to generate an output.
b) To introduce nonlinearity into a neuron's output.
c) To randomly deactivate a portion of neurons during training to prevent overfitting.
d) To calculate the gradient of the loss function with respect to each weight, enabling the network to learn.
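
Backpropagation is the chain rule applied layer by layer. The sketch below works it out by hand for a single sigmoid neuron with a squared-error loss, checks the result against a finite-difference estimate, and takes one gradient-descent step. All numbers and function names are illustrative assumptions.

```python
import math

def forward(w, b, x):
    # Single neuron: weighted input followed by a sigmoid activation.
    z = w * x + b
    return 1.0 / (1.0 + math.exp(-z))

def loss(w, b, x, y):
    # Squared-error loss for one training example.
    return 0.5 * (forward(w, b, x) - y) ** 2

def gradients(w, b, x, y):
    # Chain rule: dL/dz = (a - y) * a * (1 - a); dL/dw = dL/dz * x; dL/db = dL/dz.
    a = forward(w, b, x)
    delta = (a - y) * a * (1.0 - a)
    return delta * x, delta

w, b, x, y = 0.5, -0.2, 1.5, 1.0
dw, db = gradients(w, b, x, y)

# Finite-difference check of the analytic gradient with respect to w.
eps = 1e-6
numeric_dw = (loss(w + eps, b, x, y) - loss(w - eps, b, x, y)) / (2 * eps)

# One gradient-descent step using the computed gradients.
lr = 0.1
new_w, new_b = w - lr * dw, b - lr * db
print(dw, numeric_dw, loss(w, b, x, y), loss(new_w, new_b, x, y))
```

The gradients only say which direction reduces the loss; the weight update itself is carried out by the optimizer.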

Question 65: What is the primary purpose of a Generative Adversarial Network (GAN)?

a) To classify input data into one of several predefined categories.
b) To generate new synthetic data samples that mimic a real data set.
c) To predict a continuous numerical value based on a set of input features.
d) To group unlabeled data points into clusters based on their similarity.

Question 66: How does an autoregressive language model, such as a GPT-style model, generate text?

a) It analyzes the entire input text at once, considering both past and future words to understand context.
b) It classifies the entire text into a single category such as positive or negative.
c) It generates the next token in a sequence based only on the sequence of tokens that it has generated in previous steps.
d) It groups similar documents together into clusters without using any labels.
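
An autoregressive decoding loop can be sketched with a toy probability table. Two simplifications apply: the table is invented, and it conditions only on the most recent token, whereas a GPT-style model conditions on the entire preceding sequence. Greedy selection is used here; real systems often sample instead.

```python
# Hypothetical next-token probabilities for a tiny vocabulary.
NEXT_TOKEN_PROBS = {
    "<s>": {"the": 0.9, "a": 0.1},
    "the": {"cat": 0.6, "dog": 0.4},
    "a": {"dog": 1.0},
    "cat": {"sat": 0.7, "ran": 0.3},
    "dog": {"ran": 1.0},
    "sat": {"</s>": 1.0},
    "ran": {"</s>": 1.0},
}

def generate(max_tokens=10):
    tokens = ["<s>"]
    for _ in range(max_tokens):
        # Predict a distribution over the next token from what came before.
        probs = NEXT_TOKEN_PROBS[tokens[-1]]
        next_token = max(probs, key=probs.get)   # greedy choice
        if next_token == "</s>":
            break
        # Append the choice so it becomes context for the next step.
        tokens.append(next_token)
    return tokens[1:]

generated = generate()
print(generated)
```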

Question 67: What is the primary application of the t-SNE (t-distributed Stochastic Neighbor Embedding) algorithm in data science?

a) To train a model to classify data into predefined categories.
b) To normalize the scale of features in a data set before model training.
c) To generate new synthetic data samples that are similar to a real data set.
d) To visualize high-dimensional data in a low-dimensional space such as 2D or 3D to reveal underlying structures and clusters.

Question 68: In the context of evaluating the trustworthiness of an AI model, what is the key difference between reliability and validity?

a) Reliability measures if the model produces consistent results, while validity measures if the model is actually correct and measures what it intends to.
b) Reliability measures the model's accuracy on the training data, while validity measures its accuracy on the test data.
c) Reliability refers to the model's inference speed, while validity refers to its memory usage.
d) Both terms mean the same thing and can be used interchangeably to describe a model's overall accuracy.

Question 69: In a deep learning workflow, data is often loaded on the CPU and then transferred to the GPU for training. When this transfer becomes a bottleneck, which memory optimization technique can significantly improve throughput?

a) Using a larger batch size.
b) Applying data augmentation.
c) Using pinned memory (page-locked memory).
d) Increasing the number of layers in the neural network.

Question 70: What is a key shared characteristic between the XGBoost library and the NVIDIA RAPIDS cuML library?

a) Both are designed exclusively for deep learning and neural network tasks.
b) Both are primarily used for unsupervised learning tasks like clustering.
c) Both can be significantly accelerated using NVIDIA GPUs.
d) Both are core components of the Apache Spark ecosystem for distributed computing.

Question 71: What is the correct term for the phenomenon where an AI model generates a factually incorrect or nonsensical statement but presents it confidently as if it were true?

a) Overfitting.
b) Bias.
c) Hallucination.
d) Quantization error.

Question 72: Select two models or techniques that are used to create numerical vector representations (embeddings) of text. (Choose two)

a) k-means clustering.
b) BERT.
c) Backpropagation.
d) Doc2Vec.
e) A standard convolutional neural network (CNN).

Question 73: To adhere to the principle of privacy in trustworthy AI, what is a fundamental practice a company must implement when using personal user data for training?

a) Collect as much user data as possible to maximize model accuracy.
b) Use the data for training without explicitly informing users to speed up development.
c) Provide users with clear information and the ability to consent to or opt out of their data being used.
d) Switch to a more complex model architecture to automatically handle privacy.

Question 74: Which NVIDIA framework provides a comprehensive end-to-end platform that supports techniques ranging from simple prompt engineering to PEFT and full fine-tuning?

a) NVIDIA Triton Inference Server.
b) NVIDIA DGX system.
c) NVIDIA TensorRT.
d) NVIDIA NeMo Framework.

Question 75: In the context of AI governance and trustworthy systems, what does certification refer to?

a) The process of training the AI model on a large, diverse data set.
b) An informal code review conducted by the internal development team.
c) The formal process of verifying that an AI system meets established standards for aspects like security, fairness, and reliability.
d) The process of optimizing the model's code for faster inference speed.

Question 76: What is the key feature of the NVIDIA Triton Inference Server that automatically groups incoming real-time requests into larger batches for more efficient GPU processing?

a) Model fine-tuning.
b) Dynamic batching.
c) Load balancing.
d) Post-training quantization.

Question 77: According to the concept of scaling laws in deep learning, what is one of the most reliable ways to predictably improve the performance of a large language model?

a) Changing the activation function from ReLU to sigmoid.
b) Increasing the size of the training data set and the number of model parameters.
c) Applying post-training quantization to the model.
d) Switching to a different optimization algorithm, such as from Adam to SGD.

Question 78: Which of the following statements best describes the primary roles of ONNX and NVIDIA TensorRT in a machine learning workflow?

a) ONNX is used to train neural networks while TensorRT is used to collect and pre-process data.
b) ONNX is an open format for model interoperability while TensorRT is an SDK for optimizing and deploying models for high-performance inference.
c) Both ONNX and TensorRT are used for the same purpose of training models from scratch.
d) ONNX is an SDK for optimizing model performance while TensorRT is an open format for model interoperability.

Question 79: What are the two primary benefits of applying model quantization to a trained deep learning model? (Choose two)

a) Increased inference throughput.
b) Significantly improved predictive accuracy.
c) Reduced model memory requirements.
d) Simplified data collection process.
e) Reduced model training time.

Question 80: Which of the following metrics is the industry standard for comparing machine-generated translations against several high-quality human-expert translations?

a) Accuracy.
b) Mean Absolute Error (MAE).
c) Bilingual Evaluation Understudy (BLEU) score.
d) Perplexity.

Question 81: A company decides to switch from using a 7 billion parameter language model to a much larger 70 billion parameter model for their application.

According to AI scaling laws, what are the typical consequences of this change? (Choose two)

a) The model will require significantly less data to train.
b) The model's performance and accuracy will likely increase.
c) The model will become less expensive to host and run.
d) The model's inference time latency will likely increase.
e) The model will be easier to deploy on edge devices with limited memory.

Question 82: A user gives a large language model the following instruction: "Translate the following English sentence to French: hello how are you". The user does not provide any examples of English to French translations in the prompt. What is this type of prompting technique called?

a) One-shot prompting
b) Fine-tuning
c) Zero-shot prompting
d) Few-shot prompting

Question 83: In natural language processing, what is the primary goal of using text normalization techniques like stemming and lemmatization?

a) To generate new sentences that are grammatically correct.
b) To classify the sentiment of a text as positive, negative, or neutral.
c) To reduce inflectional forms of a word to a common base or root form.
d) To identify and categorize named entities such as persons, organizations, and locations.

Question 84: A developer is building a production-grade application that requires fast and efficient natural language processing for tasks like tokenization, part-of-speech tagging, and named entity recognition.

Which of the following Python libraries is an industry-standard, high-performance tool specifically designed for these purposes?

a) pandas
b) Matplotlib
c) spaCy
d) scikit-learn

Question 85: What is the typical relationship between the ONNX format and NVIDIA TensorRT in a model deployment workflow?

a) TensorRT is used to convert a trained model into the ONNX format.
b) A model is often exported to the ONNX format, which is then used as an input for optimization by TensorRT.
c) ONNX and TensorRT are competing standards that are mutually exclusive and cannot be used together.
d) Both ONNX and TensorRT are frameworks used for training neural networks from scratch.

Question 86: Before fine-tuning a pre-trained language model on a new custom data set, why is it a crucial first step to perform exploratory data analysis (EDA)?

a) EDA is the process of deploying the model to a production server.
b) EDA helps to understand the data's quality, identify potential issues like biases or anomalies, and inform the fine-tuning strategy.
c) EDA is a technique used to significantly increase the number of parameters in the model.
d) EDA is an unnecessary step as pre-trained models are already robust enough to handle any type of data.

Question 87: An insurance company uses an AI model to determine premium rates for customers. It is discovered that the model consistently offers higher rates to applicants from certain neighborhoods, even when their individual risk profiles are identical to applicants from other areas.

This is an example of the model violating which key principle of trustworthy AI?

a) Performance
b) Bias/Fairness
c) Reliability
d) Privacy

Question 88: A bank uses a complex deep learning model to approve or deny loan applications. When a customer's application is denied, they ask for the reason; however, the bank is unable to provide a clear explanation because the model's decision-making process is too complex to interpret.

This model lacks which important characteristic of trustworthy AI?

a) Explainability/Transparency
b) Scalability
c) Accuracy
d) Robustness

Question 89: A company deploys an AI-powered image recognition system at the entrance of their building. The system works perfectly under normal lighting conditions but consistently fails to recognize employees on very sunny or very cloudy days.

This model is failing in which key area of trustworthy AI?

a) Privacy
b) Fairness
c) Robustness
d) Transparency

Question 90: In the context of a trustworthy AI model, which statement correctly defines validity and reliability?

a) Validity refers to how fast the model makes a prediction; reliability refers to how much memory it uses.
b) Validity means the model produces consistent outputs for the same input; reliability means the model's outputs are factually correct.
c) Validity and reliability are the same and both measure the model's overall accuracy.
d) Validity means the model's outputs are factually correct and measure what is intended; reliability means the model produces consistent outputs for the same input.

Question 91: The concept of power-law generalization, or AI scaling laws, describes the relationship between the scale of a model (parameters, data set size, and compute) and its performance. Which statement best describes this relationship?

a) Model performance increases linearly with the amount of training data without any upper limit.
b) Model performance improves predictably as scale increases, but the rate of improvement diminishes.
c) A model reaches its optimal performance with a relatively small amount of data and adding more data provides no benefit.
d) The performance of a model is completely independent of its size or the amount of training data.
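
A power-law relationship between scale and loss can be sketched numerically. The snippet below assumes a loss of the form `coefficient * scale ** (-exponent)`; the coefficient `10.0` and exponent `0.1` are invented for illustration and are not fitted to any real model.

```python
def power_law_loss(scale: float, coefficient: float = 10.0, exponent: float = 0.1) -> float:
    """Hypothetical loss curve: loss falls as a power of the scale."""
    return coefficient * scale ** (-exponent)


# Grow the scale (for example, the parameter count) by 10x at each step.
scales = [1e6, 1e7, 1e8, 1e9]
losses = [power_law_loss(s) for s in scales]

# Absolute improvement gained by each 10x step.
gains = [losses[i] - losses[i + 1] for i in range(len(losses) - 1)]

for s, loss_value in zip(scales, losses):
    print(f"scale={s:.0e}  loss={loss_value:.3f}")
print([round(g, 3) for g in gains])
```

Under these assumptions the loss keeps falling with every tenfold increase in scale, but each step buys a smaller absolute improvement than the one before it.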

Question 92: In distributed data parallel training, each GPU calculates the gradients for its own mini-batch of data. Before the model weights can be updated, these individual gradients must be aggregated and synchronized across all GPUs. What is this collective communication operation called?

a) Broadcast
b) Scatter
c) All-reduce
d) Data augmentation
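
The gradient synchronization step described in Question 92 can be simulated in a single process. The sketch below is only a stand-in for the real collective operation, which communication libraries such as NCCL run across devices; the function name `all_reduce_mean` and the gradient values are assumptions for the example.

```python
def all_reduce_mean(per_worker_gradients):
    """Give every worker the element-wise mean of all workers' gradients."""
    num_workers = len(per_worker_gradients)
    # Sum the gradients position by position across workers.
    summed = [sum(values) for values in zip(*per_worker_gradients)]
    averaged = [total / num_workers for total in summed]
    # Every worker ends up holding its own copy of the same result.
    return [list(averaged) for _ in range(num_workers)]


# Three simulated GPUs, each with gradients from its own mini-batch.
local_gradients = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
synced = all_reduce_mean(local_gradients)
print(synced)  # [[3.0, 4.0], [3.0, 4.0], [3.0, 4.0]]
```

Because every worker ends up with identical gradients, each one applies the same weight update and the model replicas stay in sync.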

Question 93: What is the primary purpose of applying model quantization in a deep learning workflow?

a) To increase the predictive accuracy of the model.
b) To accelerate inference speed and reduce the model's memory footprint.
c) To augment the training data set with new synthetic examples.
d) To train a model from scratch with fewer parameters.

Question 94: A user provides the following prompt to a large language model: "Translate English to French: Sea otter = Loutre de mer, Cheese = ". This is an example of which prompting technique?

a) Fine-tuning
b) Few-shot prompting
c) Zero-shot prompting
d) One-shot prompting
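
The prompting styles in Questions 82 and 94 differ only in how many worked examples the prompt carries. The sketch below rebuilds the prompt from Question 94 and labels it by example count, using the common convention of zero, one, or several examples; the helper names `prompting_style` and `build_prompt` are hypothetical.

```python
def prompting_style(num_examples: int) -> str:
    """Name the prompting style by how many worked examples the prompt contains."""
    if num_examples < 0:
        raise ValueError("num_examples cannot be negative")
    if num_examples == 0:
        return "zero-shot"
    if num_examples == 1:
        return "one-shot"
    return "few-shot"


def build_prompt(instruction, query, examples=()):
    """Join the instruction, any worked examples, and the unanswered query."""
    parts = [f"{source} = {target}" for source, target in examples]
    parts.append(f"{query} = ")
    return f"{instruction} " + ", ".join(parts)


examples = [("Sea otter", "Loutre de mer")]
prompt = build_prompt("Translate English to French:", "Cheese", examples)
print(prompt)
print(prompting_style(len(examples)))  # one-shot
print(prompting_style(0))              # zero-shot
```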

Question 95: A developer is building a production-grade application that requires fast and efficient natural language processing for tasks like tokenization, part-of-speech tagging, and named entity recognition.

Which of the following Python libraries is an industry-standard, high-performance tool specifically designed for these purposes?

a) pandas
b) Matplotlib
c) spaCy
d) scikit-learn

Question 96: What is the primary purpose of the GLUE (General Language Understanding Evaluation) benchmark in the field of natural language processing?

a) To provide a large unlabeled corpus for pre-training language models.
b) To serve as a standardized suite of diverse NLP tasks for evaluating a model's general language understanding capabilities.
c) To be a specific type of model architecture similar to a transformer.
d) To be a software library for deploying language models into production.

Question 97: While both BLEU and ROUGE are metrics used to evaluate generated text by comparing it to a reference, they are optimized for different tasks. Which statement correctly describes their primary use cases?

a) BLEU is for text summarization (recall-focused) while ROUGE is for machine translation (precision-focused).
b) Both metrics are primarily used for evaluating the accuracy of classification models.
c) BLEU is for machine translation (precision-focused) while ROUGE is for text summarization (recall-focused).
d) Both metrics are identical and can be used interchangeably for any text generation task.
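
The precision versus recall distinction behind these two metrics can be shown with single-word overlap. This is a deliberately simplified sketch: full BLEU combines clipped n-gram precisions with a brevity penalty, and ROUGE has several variants, so the functions below (hypothetical names) only capture the core idea.

```python
from collections import Counter


def unigram_overlap(candidate: str, reference: str) -> int:
    """Count candidate words found in the reference, clipped by reference counts."""
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())
    return sum(min(count, ref_counts[word]) for word, count in cand_counts.items())


def unigram_precision(candidate: str, reference: str) -> float:
    """BLEU-style view: what fraction of the candidate's words are in the reference?"""
    return unigram_overlap(candidate, reference) / len(candidate.split())


def unigram_recall(candidate: str, reference: str) -> float:
    """ROUGE-style view: what fraction of the reference's words did the candidate cover?"""
    return unigram_overlap(candidate, reference) / len(reference.split())


reference = "the cat sat on the mat"
candidate = "the cat sat"

print(unigram_precision(candidate, reference))  # 1.0
print(unigram_recall(candidate, reference))     # 0.5
```

The short candidate scores perfectly on precision because every word it produced is correct, yet it covers only half of the reference, which is what recall penalizes.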

Question 98: A user asks a large language model: "Who improved the light bulb after Thomas Edison?" The model confidently replies: "After Thomas Edison, the light bulb was significantly improved by Arthur C. Clarke in 1972, who developed the carbon nanofilament." Arthur C. Clarke was a science fiction writer, and this invention is fictional. What is this phenomenon called?

a) Overfitting
b) Bias
c) Hallucination
d) Data poisoning

Question 99: An e-commerce company wants to determine if changing the color of their "add to cart" button from blue to green will increase user clicks. They decide to randomly show the original blue button to one group (Group A) and the new green button to another (Group B) and compare results. What is this experimental method called?

a) k-means clustering
b) Decision tree analysis
c) A/B testing
d) Exploratory Data Analysis (EDA)
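
Once both groups in an experiment like this have been observed, the click rates still need to be compared statistically. One common choice is a two-proportion z-test, sketched below; the click and visitor counts are invented, and the function name `two_proportion_z` is an assumption for the example.

```python
import math


def two_proportion_z(clicks_a, visitors_a, clicks_b, visitors_b):
    """z statistic for the difference in click rate between groups B and A."""
    rate_a = clicks_a / visitors_a
    rate_b = clicks_b / visitors_b
    # Pooled rate under the assumption that the button color makes no difference.
    pooled = (clicks_a + clicks_b) / (visitors_a + visitors_b)
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    return (rate_b - rate_a) / std_error


# Hypothetical results: blue button (Group A) versus green button (Group B).
z = two_proportion_z(clicks_a=100, visitors_a=1000, clicks_b=130, visitors_b=1000)
print(round(z, 2))
print(abs(z) > 1.96)  # True means significant at the 5% level (two-sided)
```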

Question 100: In text pre-processing, both stemming and lemmatization are used to reduce words to their root form. What is the key difference between these two techniques?

a) Stemming is a more modern technique that always produces more accurate results than lemmatization.
b) Stemming uses dictionary and morphological analysis to find a word's base form while lemmatization uses simple rules.
c) There is no difference; both are identical techniques.
d) Lemmatization uses dictionary and morphological analysis to find a word's base form while stemming uses simple rules to chop off word endings.
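
The contrast between rule-based suffix chopping and dictionary lookup can be seen in a toy example. The suffix list and the tiny lemma dictionary below are invented for illustration; real stemmers and lemmatizers use far richer rules and vocabularies.

```python
# Toy rule list and toy dictionary, both assumptions made for this sketch.
SUFFIXES = ("ing", "ies", "ed", "es", "s")
LEMMA_DICTIONARY = {"is": "be", "am": "be", "are": "be", "studies": "study"}


def toy_stem(word: str) -> str:
    """Chop the first matching suffix, keeping at least three leading letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def toy_lemmatize(word: str) -> str:
    """Look the word up in a dictionary of known base forms."""
    return LEMMA_DICTIONARY.get(word, word)


for word in ["studies", "are", "running"]:
    print(word, "->", toy_stem(word), "|", toy_lemmatize(word))
```

The stemmer turns "studies" into the non-word "stud" and cannot connect "are" to "be", while the dictionary lookup handles both but leaves "running" unchanged because that word is missing from its small vocabulary.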

Question 101: During the training of a neural network, what is the primary purpose of the backpropagation algorithm?

a) To perform the forward pass where input data is fed through the network to generate a prediction.
b) To randomly deactivate neurons in a layer to prevent the model from overfitting.
c) To initialize the weights and biases of the network before the training process begins.
d) To calculate the gradient of the loss function with respect to the network's weights, enabling learning.
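
For a single neuron, the gradient computation can be written out by hand with the chain rule. The sketch below assumes a sigmoid activation and a squared-error loss, both chosen only to keep the example small; the input, target, and parameter values are arbitrary.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def loss(w: float, b: float, x: float, y: float) -> float:
    """Squared-error loss of a one-input neuron with a sigmoid activation."""
    prediction = sigmoid(w * x + b)
    return 0.5 * (prediction - y) ** 2


def backprop_gradients(w: float, b: float, x: float, y: float):
    """Apply the chain rule from the loss back to the weight and the bias."""
    prediction = sigmoid(w * x + b)
    dloss_dpred = prediction - y                  # derivative of 0.5 * (p - y)^2
    dpred_dz = prediction * (1.0 - prediction)    # derivative of the sigmoid
    dloss_dz = dloss_dpred * dpred_dz
    return dloss_dz * x, dloss_dz                 # dL/dw, dL/db


w, b, x, y = 0.5, -0.2, 1.5, 1.0
grad_w, grad_b = backprop_gradients(w, b, x, y)

# One gradient-descent step using the computed gradients.
learning_rate = 0.1
new_w, new_b = w - learning_rate * grad_w, b - learning_rate * grad_b
print(loss(w, b, x, y), loss(new_w, new_b, x, y))
```

The second printed loss is lower than the first, showing how the gradients drive learning.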

Question 102: A data science team wants to accelerate their entire workflow (data loading, manipulation, and machine learning) on GPUs using a familiar Python API similar to pandas and scikit-learn. Which NVIDIA platform is specifically designed for this?

a) Apache Spark
b) Matplotlib
c) NVIDIA RAPIDS
d) NVIDIA NeMo framework

Question 103: When building a neural network for a binary classification task, what are the typical use cases for the sigmoid and ReLU activation functions?

a) Both are interchangeable and can be used in either the hidden layers or the output layer.
b) ReLU is used in the output layer to produce a probability while sigmoid is used in the hidden layers.
c) Sigmoid is used in the output layer to produce a probability while ReLU is used in the hidden layers.
d) Both are primarily used for regression tasks not classification.
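
To see what each activation function outputs, the sketch below runs one forward pass through a tiny network with ReLU on the hidden units and sigmoid on the final unit. The layer sizes, weights, and biases are arbitrary values picked for the example, not trained parameters.

```python
import math


def relu(x: float) -> float:
    """Pass positive values through and clamp negatives to zero."""
    return max(0.0, x)


def sigmoid(x: float) -> float:
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(features, hidden_weights, hidden_biases, output_weights, output_bias):
    """One hidden layer with ReLU, then a single sigmoid output unit."""
    hidden = [
        relu(sum(w * f for w, f in zip(weights, features)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    logit = sum(w * h for w, h in zip(output_weights, hidden)) + output_bias
    return sigmoid(logit)


probability = forward(
    features=[1.0, 2.0],
    hidden_weights=[[1.0, 0.5], [-1.0, -1.0]],
    hidden_biases=[0.0, 0.0],
    output_weights=[0.5, 2.0],
    output_bias=-0.5,
)
print(probability)  # a value strictly between 0 and 1
```

Because the sigmoid output always lies between 0 and 1, it can be read as the probability of the positive class.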

Question 104: A company decides to switch from a 7 billion parameter model to a 70 billion parameter model. According to AI scaling laws, what are the typical consequences? (Choose two)

a) The model will require significantly less data to train.
b) The model's performance and accuracy will likely increase.
c) The model will become less expensive to host and run.
d) The model's inference time latency will likely increase.
e) The model will be easier to deploy on edge devices with limited memory.

Question 106: How does the Retrieval-Augmented Generation (RAG) technique primarily enhance the trustworthiness of a large language model's outputs?

a) By encrypting the model's training data to ensure privacy.
b) By grounding the model's responses in a specific verifiable and up-to-date knowledge base, reducing hallucinations.
c) By significantly increasing the number of parameters in the model to make it more intelligent.
d) By automatically detecting and removing biases from the model's original training corpus.

Question 107: What is the primary purpose of the NVIDIA NGC catalog?

a) An online marketplace for buying and selling third-party software licenses.
b) A source code repository for collaborating on open-source projects similar to GitHub.
c) A central hub for accessing GPU-accelerated software such as containers, pre-trained models, and SDKs.
d) A cloud computing platform for renting virtual machines equipped with GPUs.

Question 108: Which NVIDIA framework provides an end-to-end platform that supports customization techniques ranging from prompt engineering to PEFT and full fine-tuning?

a) NVIDIA Triton Inference Server
b) NVIDIA DGX system
c) NVIDIA TensorRT
d) NVIDIA NeMo framework

Question 110: When transferring data from the CPU to the GPU becomes a bottleneck, which memory optimization technique can significantly improve throughput?

a) Using a larger batch size.
b) Applying data augmentation.
c) Using pinned memory (page-locked memory).
d) Increasing the number of layers in the neural network.

Question 111: Which of the following metrics is the industry standard for evaluating machine translation quality by comparing it against human expert translations?

a) Accuracy
b) Mean Absolute Error (MAE)
c) BLEU (Bilingual Evaluation Understudy) score
d) Perplexity

Question 112: Which technique is specifically designed to learn a dense numerical vector representation for each individual word in a vocabulary based on its surrounding context?

a) Doc2Vec
b) Word2Vec
c) Sentiment analysis
d) Tokenization

Question 113: Which option correctly matches the NVIDIA data center GPUs with their respective microarchitectures?

a) A100-Hopper, H100-Volta, V100-Ampere
b) A100-Volta, H100-Ampere, V100-Hopper
c) A100-Ampere, H100-Hopper, V100-Volta
d) A100-Ampere, H100-Volta, V100-Hopper

Question 114: What is the NLP task called that automatically determines whether the opinion in a review is positive, negative, or neutral?

a) Named Entity Recognition (NER)
b) Machine translation
c) Sentiment analysis
d) Text summarization

Question 115: Which emerging technology provides a verifiable "nutrition label" for AI-generated or human-edited content, securely attaching information about its origin and edit history?

a) A/B testing
b) SHA-256 hashing
c) Content Credentials (C2PA standard)
d) Generative Adversarial Networks (GANs)

Question 116: Which statement best describes the flexibility of the NVIDIA Triton Inference Server in handling different models?

a) Triton can only deploy models that were trained using the NVIDIA NeMo framework.
b) Triton can deploy models from virtually any framework, such as TensorFlow, PyTorch, TensorRT, and ONNX.
c) Triton automatically retrains models if their performance degrades over time.
d) Triton is a software library for data processing, not for deploying models.

Question 117: Which performance optimization feature of the NVIDIA Triton Inference Server automatically groups individual requests into a single larger batch?

a) Model versioning
b) Dynamic batching
c) Health monitoring
d) Rate limiting

Question 118: Where does the NVIDIA Triton Inference Server load its models and configurations from?

a) Directly from a public GitHub repository.
b) A local or cloud-based storage location known as the model repository.
c) An integrated development environment (IDE) like VS Code.
d) The NVIDIA NGC catalog exclusively.

Question 119: What is the primary role of the vector database in a Retrieval-Augmented Generation (RAG) system?

a) To store the large language model (LLM) itself.
b) To store and index numerical embeddings of text for efficient similarity search.
c) To fine-tune the large language model on new data.
d) To store the chat history of user conversations.

Question 120: What is the primary function of the RAG technique in the context of large language models?

a) A method for fine-tuning a model by updating all of its weights on a new data set.
b) A technique that enhances a model's responses by first retrieving relevant information from an external knowledge base.
c) A type of hardware accelerator designed to speed up model training.
d) A data encryption standard used to protect the model's training data.
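
The retrieval step that Questions 119 and 120 refer to can be sketched as a similarity search over stored embeddings, followed by adding the best match to the prompt. Everything concrete below is an assumption for illustration: the three-dimensional vectors are hand-written stand-ins for real embeddings, a plain dictionary stands in for a vector database, and the prompt template is made up.

```python
import math


def cosine_similarity(a, b) -> float:
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Stand-in for a vector database: text chunks mapped to hypothetical embeddings.
index = {
    "Triton loads models from a model repository.": [0.9, 0.1, 0.0],
    "BLEU compares translations against references.": [0.1, 0.9, 0.1],
    "MIG partitions one GPU into isolated instances.": [0.0, 0.2, 0.9],
}


def retrieve(query_embedding, index, top_k=1):
    """Return the top_k stored chunks most similar to the query embedding."""
    ranked = sorted(
        index,
        key=lambda text: cosine_similarity(query_embedding, index[text]),
        reverse=True,
    )
    return ranked[:top_k]


question = "Where does Triton load models from?"
query_embedding = [0.8, 0.2, 0.1]  # hypothetical embedding of the question
context = retrieve(query_embedding, index)

# Augment the prompt with the retrieved text before it is sent to the LLM.
prompt = f"Answer using this context:\n{context[0]}\n\nQuestion: {question}"
print(prompt)
```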

Written by Hitesh Sahu, a passionate developer and blogger.

Tue May 26 2026

NVIDIA Certified Associate Generative AI (NCA-GENL) Practice Questions

Question 1: Which of the following best describes Word2Vec?

a) A programming language used to build artificial intelligence models.
b) A statistical technique used to analyze word frequency in a text corpus.
c) A deep learning algorithm used to generate word embeddings from text data.
d) A database management system designed for storing and querying word data.

Question 2: In the context of language models, what does an auto regressive model predict?

a) The probability of the next token in a text given the previous tokens.
b) The probability of the next token using a Monte Carlo sampling of past tokens.
c) The next token solely using recurrent network or LSTM cells.
d) The probability of the next token by looking at the previous and future input tokens.

Question 3: In large language models, what is the purpose of the attention mechanism?

a) To measure the importance of the words in the output sequence.
b) To determine the order in which words are generated.
c) To capture the order of the words in the input sequence.
d) To assign weights to each word in the input sequence.

Question 4: In the transformer architecture, what is the purpose of positional encoding?

a) To remove redundant information from the input sequence.
b) To encode the semantic meaning of each token in the input sequence.
c) To add information about the order of each token in the input sequence.
d) To encode the importance of each token in the input sequence.

Question 5: In the field of machine learning, which of the following serves as the fundamental basis for enabling a model to learn from data?

a) Algorithms and statistical methods.
b) The "Attention is All You Need" research paper.
c) The transformer architecture.
d) Large-scale data sets.

Question 6: In a standard machine learning workflow, which key step logically fits between the training and evaluation phases?

a) Inference.
b) Data collection.
c) Model deployment.
d) Feature engineering.

Question 7: To enhance the transparency of a large language model and mitigate the blackbox problem, which technique is most effective?

a) Implementing Retrieval Augmented Generation (RAG) to connect the model to verifiable external knowledge sources.
b) Relying solely on topical guardrails to limit the model's conversation scope.
c) Publishing all models and software on a platform like NVIDIA's NGC.
d) Increasing the size of the training data set to improve statistical accuracy.

Question 8: When validating an AI system that processes sensitive user data, what is a crucial step for building trust and ensuring security and reliability?

a) Achieving compliance with a recognized standard like ISO/IEC 27001 for information security management.,,
b) Using the NVIDIA TAO toolkit with transfer learning.
c) Deploying the application using Helm charts for Kubernetes management.
d) Maximizing the model's F1 score.

Question 9: Which library is specifically designed and optimized for linguistic processing tasks like tokenization, lemmatization, and part-of-speech (POS) tagging?

a) spaCy.
b) pandas.
c) scikit-learn.
d) Matplotlib.

Question 10: Which library is most suitable to accelerate a pandas workflow on an NVIDIA GPU with minimal changes to existing code?

a) cuDF.
b) NumPy.
c) Apache Spark.
d) Dask.

Question 11: Which applications and corresponding evaluation metrics are appropriate for comparing generated text against human-created reference texts? (Choose two)

a) Application: image classification; Metric: accuracy.
b) Application: machine translation; Metric: BLEU.
c) Application: sentiment analysis; Metric: F1 score.
d) Application: text summarization; Metric: ROUGE.

Question 12: Which learning technique describes a model identifying a new class (e.g., an okapi) it has never seen during training, using only a text description?

a) One-shot learning.
b) Reinforcement learning.
c) Zero-shot learning.
d) Supervised learning.

Question 13: What are the primary advantages of using a multi-GPU configuration compared to a single GPU? (Choose two)

a) It inherently increases the final predictive accuracy of the model.
b) It reduces the time required for the training process.
c) It eliminates the need for data pre-processing.
d) It enables the training of larger models that exceed the memory capacity of a single GPU.

Question 14: If a real-time AI service has a bottleneck in the data encoding step, what is the most direct outcome of accelerating this step with a GPU?

a) Predictive accuracy will increase.
b) The model will require less training data.
c) Operational costs will be completely eliminated.
d) The overall latency of the service will decrease.

Question 15: Within the transformer architecture, what specific component is introduced to provide the model with information about word order and position?

a) The feed-forward network.
b) Word embeddings.
c) Positional encoding.
d) Attention mechanism.

Question 16: Which NVIDIA framework is specifically designed to facilitate building, training, and customizing large language models (LLMs)?

a) NVIDIA Triton Inference Server.
b) NVIDIA TensorRT.
c) NVIDIA NeMo Framework.
d) NVIDIA cuDF.

Question 17: Which NVIDIA SDK is specifically designed for deploying high-performance conversational AI services like speech-to-text and text-to-speech?

a) NVIDIA Omniverse.
b) NVIDIA NeMo Framework.
c) NVIDIA cuDF.
d) NVIDIA Riva.

Question 18: Which NVIDIA tool is a scalable solution designed to deploy and manage multiple trained AI models regardless of their original framework?

a) NVIDIA NeMo Framework.
b) NVIDIA Triton Inference Server.
c) NVIDIA TAO Toolkit.
d) NVIDIA TensorRT.

Question 19: Which open-source toolkit from NVIDIA is designed to add programmable safety controls (e.g., preventing prompt injections) to a chatbot?

a) NVIDIA Triton Inference Server.
b) NVIDIA TensorRT.
c) NeMo Guardrails.
d) NVIDIA cuML.

Question 20: Which embedding model is designed to generate contextualized word representations, allowing it to differentiate meanings of the same word based on surrounding context?

a) Word2Vec.
b) TF-IDF.
c) One-hot encoding.
d) BERT.

Question 21: Which of the following is a tree-based ensemble algorithm widely recognized for its high performance and efficiency in prediction tasks using structured tabular data?

a) k-means clustering
b) A large-scale transformer model
c) XGBoost
d) Linear regression

Question 22: When evaluating the performance of a language model, what does a lower perplexity score signify?

a) The model has a larger number of parameters.
b) The model is more surprised by the test data.
c) The model was trained on a smaller data set.
d) The model is more confident and accurate in its predictions.

Question 23: In the context of natural language processing, what is a key distinction between the perplexity metric and the BLEU score?

a) A lower score is better for BLEU while a higher score is better for perplexity.
b) Perplexity is used for image classification while BLEU is used for machine translation.
c) Perplexity evaluates how well a language model predicts a sequence of text, while BLEU compares a model's generated output against a human reference.
d) Both metrics are identical and can be used interchangeably to evaluate any NLP task.

a) The generator, which is an LLM that synthesizes the final answer based on the augmented prompt.
b) The fine-tuning module, which continuously retrains the LLM on new data.
c) The retriever, which searches a knowledge base for relevant information to augment the prompt.
d) The output validator, which checks the generated text for grammatical errors.

Question 25: Which NVIDIA platform is specifically designed for an end-to-end GPU-accelerated data science pipeline, offering a familiar Python API similar to pandas and scikit-learn?

a) Apache Spark
b) Matplotlib
c) NVIDIA RAPIDS
d) NVIDIA NeMo Framework

Question 26: In the initial phase of a data science project, what are the primary objectives of Exploratory Data Analysis (EDA)? (Choose two)

a) To deploy the final machine learning model into a production environment.
b) To identify potential anomalies, errors, and missing values within the data set.
c) To select the final hyperparameters for the model through grid search.
d) To summarize the data's main characteristics and uncover initial patterns and relationships.

Question 27: The core mechanism of a diffusion model involves two distinct phases. What are these two phases?

a) A compression phase (downsizing) and a rendering phase (upscaling).
b) A forward process where noise is progressively added to an image and a reverse process where a model learns to denoise it.
c) A feature extraction phase using a convolutional network and a classification phase using a feed-forward network.
d) A generator phase that creates an image and a discriminator phase that validates its authenticity.

Question 28: Which NVIDIA SDK is specifically designed to maximize throughput and minimize latency during inference on NVIDIA GPUs without changing the model's architecture?

a) NVIDIA Triton Inference Server
b) NVIDIA NeMo Framework
c) NVIDIA TensorRT
d) scikit-learn

a) NVIDIA TensorRT
b) ONNX (Open Neural Network Exchange)
c) A Docker container
d) A Python pickle file

Question 30: In a neural network, what are the primary roles of an activation function? (Choose two)

a) To introduce nonlinear properties, allowing the network to learn complex data patterns.
b) To group the input data into distinct clusters similar to the k-means algorithm.
c) To determine the output signal of a neuron, deciding whether it should be activated based on its weighted input.
d) To normalize the input data to have a mean of zero and a standard deviation of one.
e) To directly calculate and apply the weight and bias updates during back propagation.

Question 31: Which of the following algorithms is most suitable for an unsupervised clustering task where the data set does not contain any pre-existing labels?

a) Linear regression
b) A classification algorithm like Support Vector Machine (SVM)
c) Generative Adversarial Network (GAN)
d) k-means clustering

a) Sentiment analysis
b) Text summarization
c) Machine translation
d) Named Entity Recognition (NER)

Question 33: When optimizing a deep learning model for deployment, what are the primary advantages of applying model quantization? (Choose two)

a) A significant increase in the model's predictive accuracy.
b) Faster inference speed due to the use of lower-precision integer arithmetic.
c) A reduced memory footprint and smaller model file size.
d) It simplifies the process of data collection and pre-processing.
e) It allows the model to be trained with much less data.

Question 34: The process of model quantization often involves converting weights from FP32 to int8. What are the direct consequences of this process? (Choose three)

a) Increased computational throughput and lower inference latency.
b) A reduction in the numerical precision of the model's parameters.
c) A guaranteed improvement in the model's predictive accuracy.
d) Smaller model size and reduced memory (RAM/VRAM) requirements.
e) A significant increase in the time required to train the model.

Question 35: To evaluate a binary classification model's performance by providing a balance between precision and recall, which of the following metrics is most suitable?

a) BLEU score
b) Perplexity
c) Mean Absolute Error (MAE)
d) F1 score

a) Accuracy is only used for regression tasks, not classification.
b) A model that simply predicts "not fraud" for every transaction would also achieve 99% accuracy.
c) 99% accuracy is generally considered too low for financial applications.
d) The F1 score must be calculated first to determine the accuracy.

Question 37: Which technology enables the partitioning of a single physical GPU into multiple independent, securely isolated instances with guaranteed portions of compute and memory?

a) NVLink
b) Data parallelism
c) NVIDIA RAPIDS
d) Multi-Instance GPU (MIG)

a) k-means clustering
b) Decision tree analysis
c) A/B testing
d) Exploratory Data Analysis (EDA)

Question 39: Which of the following statements accurately describe the characteristics of a single decision tree algorithm? (Choose two)

a) It is an unsupervised learning algorithm used primarily for clustering data without labels.
b) It can be used for both classification and regression tasks.
c) It is a type of deep neural network that uses activation functions and back propagation.
d) It makes predictions by learning a series of simple if-then-else rules from the data features.
e) It operates by building a multitude of trees on random subsets of data and averaging their predictions.

Question 40: What is the text normalization technique called that transforms different forms of a word (like "is," "am," "are") back to their base dictionary form or lemma (like "be")?

a) Stemming
b) Tokenization
c) Named Entity Recognition (NER)
d) Lemmatization

Question 41: Given the sentence "The quick brown fox jumps," what is the correct set of bigrams (2-grams)?

a) the, quick, brown, fox, jumps
b) A similarity score used to compare it with another sentence
c) {the quick, quick brown, brown fox, fox jumps}
d) the quick brown, quick brown fox, brown fox jumps

a) t-SNE is primarily used for preserving the local structure of the data, making it excellent for visualizing distinct clusters.
b) PCA is a supervised learning technique that requires labeled data to perform dimensionality reduction.
c) Both PCA and t-SNE are guaranteed to find the same patterns and produce nearly identical visualizations.
d) PCA is a linear technique that aims to capture the maximum variance in the data, making it effective for understanding the global data structure.
e) The primary purpose of both techniques is to train a predictive model for classification.

a) Hyperparameter
b) Checkpoint
c) Corpus
d) Algorithm

Question 44: What is the primary purpose of the technique known as dropout during the training of neural networks?

a) To introduce nonlinearity into the network, allowing it to learn complex patterns.
b) To permanently remove unimportant neurons after training to reduce the model's file size.
c) To prevent the model from overfitting by making it learn more robust and redundant features.
d) To calculate the gradient of the loss function with respect to the model's weights.

a) Time series forecasting
b) Image classification and object detection
c) Customer segmentation using k-means
d) Predicting housing prices from a tabular data set

Question 46: What is the key architectural feature of a recurrent neural network (RNN) that enables it to process sequential data?

a) A set of convolutional layers designed to detect spatial patterns.
b) A purely feed-forward structure where information only moves in one direction without loops.
c) The exclusive use of linear activation functions throughout the network.
d) A feedback loop or hidden state that passes information from one time step to the next.

Question 47: During the training of a neural network, what is the primary purpose of the backpropagation algorithm?

a) To perform the forward pass where input data is fed through the network to generate a prediction.
b) To randomly deactivate neurons in a layer to prevent the model from overfitting.
c) To initialize the weights and biases of the network before the training process begins.
d) To calculate the gradient of the loss function with respect to the network's weights, enabling learning.

Question 48: What is the primary purpose of the NVIDIA NGC catalog?

a) An online marketplace for buying and selling third-party software licenses.
b) A source code repository for collaborating on open-source projects similar to GitHub.
c) A cloud computing platform for renting virtual machines equipped with GPUs.
d) A central hub for accessing GPU-accelerated software such as containers, pre-trained models, and SDKs.

Question 49: In the context of training a machine learning model, what is the correct definition of one epoch?

a) A single forward and backward pass of one batch of data through the network.
b) The final evaluation of the model's performance on the test data set.
c) One complete pass through the entire training data set.
d) The process of tuning the model's hyperparameters such as the learning rate.
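
The relationship between batches, steps, and epochs is simple arithmetic. The helper below is a small sketch; the name `steps_per_epoch` and the example sizes are assumptions chosen for illustration.

```python
import math

def steps_per_epoch(num_examples, batch_size):
    """Number of batches needed to see every training example once.

    The last batch may be smaller than batch_size, hence the ceiling.
    """
    if num_examples < 0 or batch_size < 1:
        raise ValueError("num_examples must be >= 0 and batch_size >= 1")
    return math.ceil(num_examples / batch_size)

# Example: 1,000 examples with a batch size of 32 over 3 epochs.
steps = steps_per_epoch(1000, 32)
total_steps = 3 * steps
```

Each step processes one batch, while an epoch is the full set of steps that covers the training data once.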

Question 50: What is the primary role of the NVIDIA Base Command Platform in an AI-powered data center?

a) Hardware interconnect for linking multiple DGX systems with high bandwidth.
b) A command-line interface for monitoring the status of a single NVIDIA GPU.
c) The software suite for managing AI development workflows and orchestrating DGX infrastructure.
d) A library for accelerating data processing tasks similar to pandas but on the GPU.

Question 51: Which software solution is specifically designed to meet enterprise-grade requirements such as long-term stability, robust security, and professional support (SLAs)?

a) Downloading the latest open-source packages directly from public repositories.
b) NVIDIA AI Enterprise
c) A standard cloud virtual machine with a base operating system.
d) A single open-source framework like PyTorch.

Question 52: Which NVIDIA offering is designed to package generative AI models as easy-to-use, scalable microservices with standard APIs to simplify integration?

a) NVIDIA DGX system
b) NVIDIA NIM (NVIDIA Inference Microservices)
c) NVIDIA NeMo Framework
d) NVIDIA Triton Inference Server

Question 53: What is the primary function of the backpropagation algorithm in the training of a neural network such as an RNN?

a) To distribute parts of a model across multiple GPUs for parallel computation.
b) To serve the trained model in a production environment with low latency.
c) To randomly ignore certain neurons during training to prevent overfitting.
d) To enable the network to learn by calculating the gradient of the loss function with respect to the network's weights.

Question 54: Which NVIDIA offering provides a collection of optimized microservices specifically for core RAG tasks, including embedding documents and searching for relevant context?

a) NVIDIA Omniverse
b) NeMo Guardrails
c) NVIDIA NeMo Retriever
d) A custom Python script using scikit-learn

a) Use NVIDIA AI workflows for connecting models and LangChain for building the application.
b) Use NVIDIA AI workflows for both building the application and connecting the models.
c) Use NVIDIA AI workflows for building the application and LangChain for connecting the models.
d) Use LangChain for both building the application and connecting the models.

Question 56: Which of the following components and techniques are key to the parallel processing capability of the transformer architecture? (Choose two)

a) A recurrent feedback loop that processes tokens one by one in a strict sequence.
b) Multi-head attention, which allows different attention heads to be computed independently and in parallel.
c) The model's ability to process all input tokens simultaneously rather than sequentially.
d) A k-means clustering algorithm to group similar tokens before processing.
e) A single attention mechanism that must process all relationships serially.

Question 57: Which technology is specifically designed to protect sensitive data while it is actively being processed in memory during use?

a) Standard software-based encryption
b) A virtual private network (VPN)
c) Confidential Computing
d) A network firewall

Question 58: Which activation function is most suitable for the output layer of a binary classification problem to represent a probability (0 to 1)?

a) ReLU (Rectified Linear Unit)
b) Tanh (Hyperbolic Tangent)
c) Softmax
d) Sigmoid
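
For reference, a numerically stable sigmoid can be sketched in a few lines of plain Python. This is a generic formulation, not code from a specific library.

```python
import math

def sigmoid(z):
    """Map any real number to the range [0, 1] without overflowing exp()."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)  # z is negative here, so exp(z) cannot overflow
    return e / (1.0 + e)
```

The output can be read as the predicted probability of the positive class, and `sigmoid(z) + sigmoid(-z)` equals 1 up to rounding.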

a) Full fine-tuning
b) k-means clustering
c) LoRA (Low-Rank Adaptation)
d) Post-training quantization (PTQ)

Question 60: What is the primary characteristic that distinguishes generative AI from other types of AI such as predictive or analytical AI?

a) Its ability to cluster unlabeled data into distinct groups.
b) Its ability to classify existing data into predefined categories.
c) Its ability to create new, original content such as text, images, or music that resembles the data it was trained on.
d) Its ability to predict a continuous numerical value based on input features.

Question 61: In the transformer architecture, what is the primary purpose of positional encoding?

a) To capture the semantic meaning of each word in the vocabulary.
b) To calculate the contextual importance of each word relative to others in the sequence.
c) To provide the model with information about the order and position of tokens in a sequence.
d) To apply a final nonlinear transformation to the output of the attention block.

a) To generate a numerical vector representation for each individual word in the vocabulary.
b) To classify a document into predefined categories like sports or politics.
c) To generate a single numerical vector that represents the semantic meaning of an entire document or paragraph.
d) To create a new synthetically generated document that is stylistically similar to the input.

Question 63: Which of the following options correctly lists the steps of a typical machine learning workflow in the proper sequential order?

a) Model training, model evaluation, data collection, data pre-processing, model deployment.
b) Data collection, model deployment, data pre-processing, model training, model evaluation.
c) Data collection, data pre-processing, model training, model evaluation, model deployment.
d) Data pre-processing, data collection, model training, model deployment, model evaluation.

Question 64: In the process of training a neural network, what is the primary role of the backpropagation algorithm?

a) To perform the initial forward pass of data through the network to generate an output.
b) To introduce nonlinearity into a neuron's output.
c) To randomly deactivate a portion of neurons during training to prevent overfitting.
d) To calculate the gradient of the loss function with respect to each weight, enabling the network to learn.

Question 65: What is the primary purpose of a Generative Adversarial Network (GAN)?

a) To classify input data into one of several predefined categories.
b) To generate new synthetic data samples that mimic a real data set.
c) To predict a continuous numerical value based on a set of input features.
d) To group unlabeled data points into clusters based on their similarity.

Question 66: How does an autoregressive language model, such as a GPT-style model, generate text?

a) It analyzes the entire input text at once, considering both past and future words to understand context.
b) It classifies the entire text into a single category such as positive or negative.
c) It generates the next token in a sequence based only on the sequence of tokens that it has generated in previous steps.
d) It groups similar documents together into clusters without using any labels.

Question 67: What is the primary application of the t-SNE (t-distributed Stochastic Neighbor Embedding) algorithm in data science?

a) To train a model to classify data into predefined categories.
b) To normalize the scale of features in a data set before model training.
c) To generate new synthetic data samples that are similar to a real data set.
d) To visualize high-dimensional data in a low-dimensional space such as 2D or 3D to reveal underlying structures and clusters.

Question 68: In the context of evaluating the trustworthiness of an AI model, what is the key difference between reliability and validity?

a) Reliability measures if the model produces consistent results, while validity measures if the model is actually correct and measures what it intends to.
b) Reliability measures the model's accuracy on the training data, while validity measures its accuracy on the test data.
c) Reliability refers to the model's inference speed, while validity refers to its memory usage.
d) Both terms mean the same thing and can be used interchangeably to describe a model's overall accuracy.

a) Using a larger batch size.
b) Applying data augmentation.
c) Using pinned memory (page-locked memory).
d) Increasing the number of layers in the neural network.

Question 70: What is a key shared characteristic between the XGBoost library and the NVIDIA RAPIDS cuML library?

a) Both are designed exclusively for deep learning and neural network tasks.
b) Both are primarily used for unsupervised learning tasks like clustering.
c) Both can be significantly accelerated using NVIDIA GPUs.
d) Both are core components of the Apache Spark ecosystem for distributed computing.

Question 71: What is the correct term for the phenomenon where an AI model generates a factually incorrect or nonsensical statement but presents it confidently as if it were true?

a) Overfitting.
b) Bias.
c) Hallucination.
d) Quantization error.

Question 72: Select two models or techniques that are used to create numerical vector representations (embeddings) of text. (Choose two)

a) k-means clustering.
b) BERT.
c) Backpropagation.
d) Doc2Vec.
e) A standard convolutional neural network (CNN).

Question 73: To adhere to the principle of privacy in trustworthy AI, what is a fundamental practice a company must implement when using personal user data for training?

a) Collect as much user data as possible to maximize model accuracy.
b) Use the data for training without explicitly informing users to speed up development.
c) Provide users with clear information and the ability to consent to or opt out of their data being used.
d) Switch to a more complex model architecture to automatically handle privacy.

Question 74: Which NVIDIA framework provides a comprehensive end-to-end platform that supports techniques ranging from simple prompt engineering to PEFT and full fine-tuning?

a) NVIDIA Triton Inference Server.
b) NVIDIA DGX system.
c) NVIDIA TensorRT.
d) NVIDIA NeMo framework.

Question 75: In the context of AI governance and trustworthy systems, what does certification refer to?

a) The process of training the AI model on a large, diverse data set.
b) An informal code review conducted by the internal development team.
c) The formal process of verifying that an AI system meets established standards for aspects like security, fairness, and reliability.
d) The process of optimizing the model's code for faster inference speed.

Question 76: What is the key feature of the NVIDIA Triton Inference Server that automatically groups incoming real-time requests into larger batches for more efficient GPU processing?

a) Model fine-tuning.
b) Dynamic batching.
c) Load balancing.
d) Post-training quantization.
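
The general idea of grouping requests that arrive close together can be sketched conceptually as below. This is a toy model only: the function name, the `max_delay` window, and the request tuples are assumptions for illustration, and it does not reflect Triton's actual implementation or configuration options.

```python
def group_requests(arrivals, max_batch_size, max_delay):
    """Group (request_id, arrival_time) pairs, sorted by time, into batches.

    A batch is closed when it is full, or when the next request arrives
    more than max_delay after the first request in the current batch.
    """
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be >= 1")
    batches, current, start = [], [], None
    for req_id, t in arrivals:
        if current and (len(current) >= max_batch_size or t - start > max_delay):
            batches.append(current)
            current = []
        if not current:
            start = t  # arrival time of the first request in the new batch
        current.append(req_id)
    if current:
        batches.append(current)
    return batches
```

With a generous size limit, requests that arrive within the delay window share one batch; a tighter size limit splits them earlier.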

Question 77: According to the concept of scaling laws in deep learning, what is one of the most reliable ways to predictably improve the performance of a large language model?

a) Changing the activation function from ReLU to sigmoid.
b) Increasing the size of the training data set and the number of model parameters.
c) Applying post-training quantization to the model.
d) Switching to a different optimization algorithm like from Adam to SGD.

Question 78: Which of the following statements best describes the primary roles of ONNX and NVIDIA TensorRT in a machine learning workflow?

a) ONNX is used to train neural networks while TensorRT is used to collect and pre-process data.
b) ONNX is an open format for model interoperability while TensorRT is an SDK for optimizing and deploying models for high-performance inference.
c) Both ONNX and TensorRT are used for the same purpose of training models from scratch.
d) ONNX is an SDK for optimizing model performance while TensorRT is an open format for model interoperability.

Question 79: What are the two primary benefits of applying model quantization to a trained deep learning model? (Choose two)

a) Increased inference throughput.
b) Significantly improved predictive accuracy.
c) Reduced model memory requirements.
d) Simplified data collection process.
e) Reduced model training time.

Question 80: Which of the following metrics is the industry standard for comparing machine-generated translations against several high-quality human-expert translations?

a) Accuracy.
b) Mean Absolute Error (MAE).
c) Bilingual Evaluation Understudy (BLEU) score.
d) Perplexity.
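
One ingredient of BLEU is the clipped (modified) n-gram precision against multiple references. The sketch below implements only that piece in plain Python; full BLEU also combines several n-gram orders and applies a brevity penalty, which are omitted here. The function names are illustrative.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def modified_precision(candidate, references, n):
    """Clipped n-gram precision of a candidate against reference translations."""
    cand_counts = Counter(ngrams(candidate, n))
    if not cand_counts:
        return 0.0
    # For each n-gram, the most times it appears in any single reference.
    max_ref = Counter()
    for ref in references:
        for gram, count in Counter(ngrams(ref, n)).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())
```

Clipping stops a candidate from scoring well by repeating a common word: seven copies of "the" only get credit for as many occurrences as the best reference contains.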

Question 81: A company decides to switch from using a 7 billion parameter language model to a much larger 70 billion parameter model for their application.

According to AI scaling laws, what are the typical consequences of this change? (Choose two)

a) The model will require significantly less data to train.
b) The model's performance and accuracy will likely increase.
c) The model will become less expensive to host and run.
d) The model's inference time latency will likely increase.
e) The model will be easier to deploy on edge devices with limited memory.

a) One-shot prompting
b) Fine-tuning
c) Zero-shot prompting
d) Few-shot prompting

Question 83: In natural language processing, what is the primary goal of using text normalization techniques like stemming and lemmatization?

a) To generate new sentences that are grammatically correct.
b) To classify the sentiment of a text as positive, negative, or neutral.
c) To reduce inflectional forms of a word to a common base or root form.
d) To identify and categorize named entities such as persons, organizations, and locations.

Which of the following Python libraries is an industry standard high-performance tool specifically designed for these purposes?

a) pandas
b) Matplotlib
c) spaCy
d) scikit-learn

Question 85: What is the typical relationship between the ONNX format and NVIDIA TensorRT in a model deployment workflow?

a) TensorRT is used to convert a trained model into the ONNX format.
b) A model is often exported to the ONNX format, which is then used as an input for optimization by TensorRT.
c) ONNX and TensorRT are competing standards that are mutually exclusive and cannot be used together.
d) Both ONNX and TensorRT are frameworks used for training neural networks from scratch.

Question 86: Before fine-tuning a pre-trained language model on a new custom data set, why is it a crucial first step to perform exploratory data analysis (EDA)?

a) EDA is the process of deploying the model to a production server.
b) EDA helps to understand the data's quality, identify potential issues like biases or anomalies, and inform the fine-tuning strategy.
c) EDA is a technique used to significantly increase the number of parameters in the model.
d) EDA is an unnecessary step as pre-trained models are already robust enough to handle any type of data.

This is an example of the model violating which key principle of trustworthy AI?

a) Performance
b) Bias/Fairness
c) Reliability
d) Privacy

This model lacks which important characteristic of trustworthy AI?

a) Explainability/Transparency
b) Scalability
c) Accuracy
d) Robustness

This model is failing in which key area of trustworthy AI?

a) Privacy
b) Fairness
c) Robustness
d) Transparency

Question 90: In the context of a trustworthy AI model, which statement correctly defines validity and reliability?

a) Validity refers to how fast the model makes a prediction; reliability refers to how much memory it uses.
b) Validity means the model produces consistent outputs for the same input; reliability means the model's outputs are factually correct.
c) Validity and reliability are the same and both measure the model's overall accuracy.
d) Validity means the model's outputs are factually correct and measure what is intended; reliability means the model produces consistent outputs for the same input.

a) Model performance increases linearly with the amount of training data without any upper limit.
b) Model performance improves predictably as scale increases, but the rate of improvement diminishes.
c) A model reaches its optimal performance with a relatively small amount of data and adding more data provides no benefit.
d) The performance of a model is completely independent of its size or the amount of training data.

a) Broadcast
b) Scatter
c) All-reduce
d) Data augmentation

Question 93: What is the primary purpose of applying model quantization in a deep learning workflow?

a) To increase the predictive accuracy of the model.
b) To accelerate inference speed and reduce the model's memory footprint.
c) To augment the training data set with new synthetic examples.
d) To train a model from scratch with fewer parameters.

Question 94: A user provides the following prompt to a large language model: "Translate English to French: Sea otter = Loutre de mer, Cheese = ". This is an example of which prompting technique?

a) Fine-tuning
b) Few-shot prompting
c) Zero-shot prompting
d) One-shot prompting

Which of the following Python libraries is an industry standard high-performance tool specifically designed for these purposes?

a) pandas
b) Matplotlib
c) spaCy
d) scikit-learn

Question 96: What is the primary purpose of the GLUE (General Language Understanding Evaluation) benchmark in the field of natural language processing?

a) To provide a large unlabeled corpus for pre-training language models.
b) To serve as a standardized suite of diverse NLP tasks for evaluating a model's general language understanding capabilities.
c) To be a specific type of model architecture similar to a transformer.
d) To be a software library for deploying language models into production.

a) BLEU is for text summarization (recall-focused) while ROUGE is for machine translation (precision-focused).
b) Both metrics are primarily used for evaluating the accuracy of classification models.
c) BLEU is for machine translation (precision-focused) while ROUGE is for text summarization (recall-focused).
d) Both metrics are identical and can be used interchangeably for any text generation task.

a) Overfitting
b) Bias
c) Hallucination
d) Data poisoning

a) k-means clustering
b) Decision tree analysis
c) A/B testing
d) Exploratory Data Analysis (EDA)

Question 100: In text pre-processing, both stemming and lemmatization are used to reduce words to their root form. What is the key difference between these two techniques?

a) Stemming is a more modern technique that always produces more accurate results than lemmatization.
b) Stemming uses dictionary and morphological analysis to find a word's base form while lemmatization uses simple rules.
c) There is no difference; both are identical techniques.
d) Lemmatization uses dictionary and morphological analysis to find a word's base form while stemming uses simple rules to chop off word endings.
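
The contrast between rule-based stemming and dictionary-based lemmatization can be seen in a toy example. Both functions below are deliberately simplistic: the suffix list and the four-entry `TOY_LEMMAS` dictionary are made up for illustration and are far smaller than what real NLP libraries use.

```python
def toy_stem(word):
    """Chop off a common suffix using fixed rules, with no dictionary lookup."""
    for suffix in ("ing", "ies", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[:-len(suffix)]
    return word

# A tiny hand-written lookup standing in for a real morphological dictionary.
TOY_LEMMAS = {"studies": "study", "running": "run", "better": "good", "mice": "mouse"}

def toy_lemmatize(word):
    """Return the dictionary base form if known, otherwise the word itself."""
    return TOY_LEMMAS.get(word, word)
```

Here `toy_stem("studies")` yields the non-word "stud", while `toy_lemmatize("studies")` yields "study"; irregular forms such as "mice" are only handled by the dictionary approach.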

Question 101: During the training of a neural network, what is the primary purpose of the backpropagation algorithm?

a) To perform the forward pass where input data is fed through the network to generate a prediction.
b) To randomly deactivate neurons in a layer to prevent the model from overfitting.
c) To initialize the weights and biases of the network before the training process begins.
d) To calculate the gradient of the loss function with respect to the network's weights, enabling learning.

a) Apache Spark
b) Matplotlib
c) NVIDIA RAPIDS
d) NVIDIA NeMo framework

Question 103: When building a neural network for a binary classification task, what are the typical use cases for the sigmoid and ReLU activation functions?

a) Both are interchangeable and can be used in either the hidden layers or the output layer.
b) ReLU is used in the output layer to produce a probability while sigmoid is used in the hidden layers.
c) Sigmoid is used in the output layer to produce a probability while ReLU is used in the hidden layers.
d) Both are primarily used for regression tasks not classification.

Question 104: A company decides to switch from a 7 billion parameter model to a 70 billion parameter model. According to AI scaling laws, what are the typical consequences? (Choose two)

a) The model will require significantly less data to train.
b) The model's performance and accuracy will likely increase.
c) The model will become less expensive to host and run.
d) The model's inference time latency will likely increase.
e) The model will be easier to deploy on edge devices with limited memory.

Question 106: How does the Retrieval-Augmented Generation (RAG) technique primarily enhance the trustworthiness of a large language model's outputs?

a) By encrypting the model's training data to ensure privacy.
b) By grounding the model's responses in a specific verifiable and up-to-date knowledge base, reducing hallucinations.
c) By significantly increasing the number of parameters in the model to make it more intelligent.
d) By automatically detecting and removing biases from the model's original training corpus.

Question 107: What is the primary purpose of the NVIDIA NGC catalog?

a) An online marketplace for buying and selling third-party software licenses.
b) A source code repository for collaborating on open-source projects similar to GitHub.
c) A central hub for accessing GPU-accelerated software such as containers, pre-trained models, and SDKs.
d) A cloud computing platform for renting virtual machines equipped with GPUs.

Question 108: Which NVIDIA framework provides an end-to-end platform that supports customization techniques ranging from prompt engineering to PEFT and full fine-tuning?

a) NVIDIA Triton Inference Server
b) NVIDIA DGX system
c) NVIDIA TensorRT
d) NVIDIA NeMo framework

Question 110: When transferring data from the CPU to the GPU becomes a bottleneck, which memory optimization technique can significantly improve throughput?

a) Using a larger batch size.
b) Applying data augmentation.
c) Using pinned memory (page-locked memory).
d) Increasing the number of layers in the neural network.

Question 111: Which of the following metrics is the industry standard for evaluating machine translation quality by comparing it against human expert translations?

a) Accuracy
b) Mean Absolute Error (MAE)
c) BLEU (Bilingual Evaluation Understudy) score
d) Perplexity

Question 112: Which technique is specifically designed to learn a dense numerical vector representation for each individual word in a vocabulary based on its surrounding context?

a) Doc2Vec
b) Word2Vec
c) Sentiment analysis
d) Tokenization

Question 113: Which option correctly matches the NVIDIA data center GPUs with their respective microarchitectures?

a) A100-Hopper, H100-Volta, V100-Ampere
b) A100-Volta, H100-Ampere, V100-Hopper
c) A100-Ampere, H100-Hopper, V100-Volta
d) A100-Ampere, H100-Volta, V100-Hopper

Question 114: What is the NLP task called that automatically determines whether the opinion in a review is positive, negative, or neutral?

a) Named Entity Recognition (NER)
b) Machine translation
c) Sentiment analysis
d) Text summarization

Question 115: Which emerging technology provides a verifiable "nutrition label" for AI-generated or human-edited content, securely attaching information about its origin and edit history?

a) A/B testing
b) SHA-256 hashing
c) Content Credentials (C2PA standard)
d) Generative Adversarial Networks (GANs)

Question 116: Which statement best describes the flexibility of the NVIDIA Triton Inference Server in handling different models?

a) Triton can only deploy models that were trained using the NVIDIA NeMo framework.
b) Triton can deploy models from virtually any framework, such as TensorFlow, PyTorch, TensorRT, and ONNX.
c) Triton automatically retrains models if their performance degrades over time.
d) Triton is a software library for data processing, not for deploying models.

Question 117: Which performance optimization feature of the NVIDIA Triton Inference Server automatically groups individual requests into a single larger batch?

a) Model versioning
b) Dynamic batching
c) Health monitoring
d) Rate limiting

Question 118: Where does the NVIDIA Triton Inference Server load its models and configurations from?

a) Directly from a public GitHub repository.
b) A local or cloud-based storage location known as the model repository.
c) An integrated development environment (IDE) like VS Code.
d) The NVIDIA NGC catalog exclusively.

Question 119: What is the primary role of the vector database in a Retrieval-Augmented Generation (RAG) system?

a) To store the large language model (LLM) itself.
b) To store and index numerical embeddings of text for efficient similarity search.
c) To fine-tune the large language model on new data.
d) To store the chat history of user conversations.

Question 120: What is the primary function of the RAG technique in the context of large language models?

a) A method for fine-tuning a model by updating all of its weights on a new data set.
b) A technique that enhances a model's responses by first retrieving relevant information from an external knowledge base.
c) A type of hardware accelerator designed to speed up model training.
d) A data encryption standard used to protect the model's training data.

Written by Hitesh Sahu, a passionate developer and blogger.

Tue May 26 2026
