Q2 of 2 You are required to build an artificial intelligence model on enormous data which can identify among 10 electronic device images. You are provided with ample computational resources, however, the board needs a proper explanation on your final model outcomes. What will be your take on this?

You first start with a classical machine learning model and compare its performance with that of a deep learning model. If the difference isn’t big, then you finalize the classical machine learning model.
You directly start building a deep Learning model to gain high performance using available data and resources.
You can build either a classical machine learning model or a deep learning model as the output of both the models can be interpreted.

2. Optimization Algorithms

Q1 of 2 Which variant of gradient descent (not adaptive learning rate algorithms) uses only k samples (k not equal to 1) for each parameter update to find the optimal parameters?

Mini-batch
Stochastic
Adam
Adagrad

Explanation: Mini-batch gradient descent uses a fixed number of samples for each update, and that batch size stays constant throughout the epochs.
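As a rough illustration of the explanation above, the sketch below splits one epoch of data into batches of a fixed size. The helper name `iterate_minibatches` and the toy data are assumptions made for this example, not part of the quiz.

```python
import random


def iterate_minibatches(samples, batch_size, seed=0):
    """Yield shuffled batches of `batch_size` samples covering one epoch."""
    order = list(range(len(samples)))
    random.Random(seed).shuffle(order)
    for start in range(0, len(order), batch_size):
        yield [samples[i] for i in order[start:start + batch_size]]


data = list(range(10))  # toy dataset (hypothetical)
batches = list(iterate_minibatches(data, batch_size=4))
```

Only the last batch can be smaller than `batch_size`, which happens when the dataset size is not a multiple of it.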

Q2 of 2 What is true if you use large learning-rate steps?

The algorithm may skip the optimal local minima and may never converge.
The algorithm can find the optimal local minima soon leading to early convergence.
The optimum learning rate falls in the range of 0.1 to 1.
The optimum learning rate falls in the range of 0.01 to 0.1.

Explanation: Large steps make the algorithm keep bouncing around the minimum, which can lead to divergence.
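A minimal sketch of this behaviour, assuming the simple objective f(x) = x**2 (chosen here only for illustration): a small learning rate shrinks x toward the minimum, while a large one overshoots further on every step.

```python
def gradient_descent(lr, steps=20, x0=1.0):
    """Minimize f(x) = x**2 (gradient 2x) and return the final x."""
    x = x0
    for _ in range(steps):
        x -= lr * 2 * x
    return x


small_step = gradient_descent(lr=0.1)  # each step multiplies x by 0.8
large_step = gradient_descent(lr=1.1)  # each step multiplies x by -1.2
```

With `lr=1.1` the sign of x flips on every step and its magnitude grows, which is the bouncing described above.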

3. Activation Functions - Quiz

Q1 of 4 What is true about sigmoid activation function?

It may saturate and kill gradients. check
It has slow convergence. check
Its output isn’t zero centered. check
It follows a U-shaped curve.

Q2 of 4 Which of the following activation functions follows the zero-centered principle, i.e. keeps its output within the range of [-1, 1]?

Sigmoid
Tanh
Linear
ReLU
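To make the zero-centered distinction concrete, the sketch below averages sigmoid and tanh outputs over a symmetric set of inputs. The input values are assumptions picked for illustration.

```python
import math


def sigmoid(x):
    """Logistic sigmoid, with outputs in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


inputs = [-2.0, -1.0, 0.0, 1.0, 2.0]  # symmetric around zero (hypothetical)
sigmoid_mean = sum(sigmoid(x) for x in inputs) / len(inputs)
tanh_mean = sum(math.tanh(x) for x in inputs) / len(inputs)
```

For inputs symmetric around zero, the sigmoid outputs average to 0.5 while the tanh outputs average to 0.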

Q3 of 4 Which of the following points about ELU makes it noise-robust?

It saturates to a negative value for smaller arguments, implying smaller derivative.
It has diminished vanishing gradient effect.
Its output is linear for positive net inputs.
It has a tunable hyper-parameter, α.

Q4 of 4 Which of the following is a disadvantage of ELU activation function?

Negative output for negative weighted sum of inputs
Logarithmic curve for negative weighted sum of inputs
Additional hyperparameter, α that needs tuning which adds extra overhead to deep neural network training.
None of the above

4. Implementing ANN - Quiz

Q1 of 2 Given a data set of 100,000 RGB images of shape 50x50, how many nodes are required in the input layer to build a neural network?

7,500
2,500
100,000
50,000,000
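The arithmetic behind this question can be sketched as follows, assuming each image is flattened so that every pixel channel feeds one input node. The number of images does not affect the input layer size.

```python
height, width, channels = 50, 50, 3  # RGB images of shape 50x50
input_nodes = height * width * channels
```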

Q2 of 2 Given two snippets to encode categorical target variable to numeric variables:

# Snippet 1
lb = LabelEncoder()
train_y = lb.fit_transform(train.Class)
train_y = utils.np_utils.to_categorical(train_y)

# Snippet 2
lb = LabelEncoder()
train_y = lb.fit_transform(train.Class)
train_y = utils.np_utils.to_categorical(train_y)
lb_T = LabelEncoder()
test_y = lb_T.fit_transform(test.Class)
test_y = utils.np_utils.to_categorical(test_y)

Which snippet will you prefer before you start building your ANN model?

Snippet 1
Snippet 2
Either snippet 1 or snippet 2
Option Snippet 1 is Correct
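One possible reason for preferring a single encoder is sketched below: an encoder fitted separately on the test labels can assign different integers to the same class. The helper `fit_labels` is a hypothetical stand-in that mimics sorted label-to-integer mapping, and the class names are made up for the example.

```python
def fit_labels(labels):
    """Map each distinct label to an integer, in sorted order."""
    return {label: index for index, label in enumerate(sorted(set(labels)))}


train_classes = ["laptop", "phone", "tablet", "phone"]
test_classes = ["phone", "tablet"]  # "laptop" is absent from this split

train_mapping = fit_labels(train_classes)  # fitted once, on training labels
refit_mapping = fit_labels(test_classes)   # what a second encoder would learn

test_y_shared = [train_mapping[c] for c in test_classes]
test_y_refit = [refit_mapping[c] for c in test_classes]
```

Here "phone" is class 1 under the training mapping but class 0 under the refitted one, so predictions and test targets would no longer line up.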

5. Hyperparameter Tuning - Quiz

Q1 of 2 Based on the given validation loss, which optimization algorithm is preferred:

SGD
Adadelta
Adagrad
Adam
RMSProp

Q2 of 2 Consider the given observations for different values of bias:

In which sequence will you choose the bias values starting from first priority to last priority?

0 - 0.01 - 0.1
0 - 0.1 - 0.01
0.01 - 0 - 0.1
0.01 - 0.1 - 0

6. Introducing CNN - Quiz

Q1 of 3 Find the value of 'X' for the given instance of convolution:

-2
0
4
-1

Q2 of 3 What filter size and stride value results in the following max-pooling operation?

Filter size: 2x2 | Stride: 2
Filter size: 1x1 | Stride: 2
Filter size: 2x2 | Stride: 1
Filter size: 1x1 | Stride: 1

Q3 of 3 The AlexNet (winner of ImageNet, 2012) used images of size [227 x 227 x 3]. Find the number of neurons connected to the input volume with given parameters.

Filter size, F=11
Stride, S=4
Zero-padding, P=0
Number of filters, K=96

290,400
55
290,455
290,345
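The count can be worked out with the usual output-size relation (W - F + 2P) / S + 1, applied to the parameters given in the question. The function name below is an assumption for this sketch.

```python
def conv_output_size(input_size, filter_size, stride, padding):
    """Spatial size of a convolution output along one dimension."""
    return (input_size - filter_size + 2 * padding) // stride + 1


side = conv_output_size(input_size=227, filter_size=11, stride=4, padding=0)
neurons = side * side * 96  # one neuron per output position per filter
```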

7. Implementation of CNN - Quiz

Q1 of 2 What units value would you pass to the output layer of a CNN model designed for a multi-class classification problem having five classes with 16, 32, and 64 filters assigned for in-between convolution layers respectively?

Q2 of 2 What is true about batch normalization?

It acts as an extra layer of normalization at the intermediate steps. check
It doesn't let the activation go higher or lower. check
It works in complement to the dropout regularization. check
It has a slight regularization property as it adds some noise to each hidden layer activation functions and thus helps avoid overfitting. check

8. Introducing RNN - Quiz

Q1 of 3 How many hidden layers are executed in a basic RNN cell when used for 50 time steps?

Q2 of 3 How many inputs does an LSTM cell receive at a given timestamp?

Q3 of 3 In LSTM, which gate controls how much previous data needs to be retained?

Forget gate
Input gate
Output gate

9. LSTM - Quiz

Q1 of 2 The exploding gradient problem is an issue in training deep networks where the gradient gets so large that the loss diverges to a very high value. What is a probable approach when dealing with the “Exploding Gradient” problem in RNNs?

Use modified architectures like LSTM and GRUs
Dropout
Gradient clipping check
Batch normalization

Explanation: Yes, you are correct! To deal with the exploding gradient problem, it’s best to threshold the gradient values at a specific point. This is called gradient clipping.
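A minimal sketch of clipping by value, one common variant of gradient clipping, is shown below. The function name and the threshold of 5.0 are assumptions for illustration; deep learning frameworks also offer clipping by norm.

```python
def clip_by_value(gradient, threshold):
    """Limit every gradient component to the range [-threshold, threshold]."""
    return [max(-threshold, min(threshold, g)) for g in gradient]


clipped = clip_by_value([0.5, -12.0, 7.0], threshold=5.0)
```

Components already inside the range pass through unchanged, so only the oversized values are affected.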

Q2 of 2 You need to find the sentiment of a given paragraph and apply this technique for future use. Which RNN architecture will be good for you?

One to one
One to many
Many to many
Many to one

10. GRU - Quiz

Q1 of 2 Which of the following statements is Incorrect about the input_dim parameter of tf.keras.layers.Embedding?

It must always be an integer value.
It gives the size of the vocabulary.
It gives the shape of the dense embedding.
It is given by the maximum possible integer index of any word in the vocabulary plus 1.

Q2 of 2 Given an input as shown below –

(3, 4)
(1, 2)
(1, 2)
(3, 2)

11. Introducing Auto-Encoders - Quiz

Q1 of 2 There are five layers in the encoder while constructing an auto-encoder network. How many layers are required to be in the decoder?

Five
Three
Four
Not fixed

Q2 of 2 What is true about auto-encoders?

They perform lossless compression.
They perform lossy compression.
The maximum amount encoder can compress is referred to as latent-space.
The maximum amount encoder can decompress is referred to as bottleneck.

12. Image denoising using auto-encoders - Quiz

Q1 of 3 Which Keras method is used to perform decompressing in the decoder module of the auto-encoders?

UpSampling2D
UpSampling
upsampling2D
upsampling

Q2 of 3 While training an auto-encoder model, you have passed the same dataset as both the training data and the target data, as shown:

...

model.fit(train_imgs, train_imgs, epochs=50, validation_split=0.2)

What is the use of the above model?

It can be used for image compression tasks.
It can be used for image denoising tasks.
It can be used for image edge detection tasks.
It can be used for image segmentation tasks.

Explanation: Since the input and the target are the same, the model learns to compress the input while preserving its original content.

Q3 of 3 What is correct about auto-encoder architecture?

It is necessary to have same number of layers in both encoder and decoder. check
It is necessary that weights must be shared between both the modules. check
Upsampling layers can be present only in the encoder module.
Upsampling layers can be present either in encoder or decoder module. check

13. Advanced Deep Learning Architectures - Quiz

Q1 of 5 Real-time object detection is used in self-driving cars to detect road signs and to recognize pedestrians and roadside buildings. Object detection is a combination of which of the following computer vision tasks:

Image colorization and object classification
Object localization and Object classification check
Image segmentation and object localization
Edge detection and object classification

Q2 of 5 YOLO works based on which type of Neural Network

Convolutional Neural Network check
Recurrent Neural Network
Artificial Neural Network
LSTM

Q3 of 5 Which of the following framework is the base for YOLO?

Darknet check
ImageNet
RNN
TensorFlow

Q4 of 5 Which of the following data sets is used in the YOLO framework?

MNIST data set
CIFAR data Set
COCO data set

Q5 of 5 How many objects can YOLO detect in an input image?

14. Optimization of Training in Deep Learning Quiz

Q1 of 5 Cross entropy Loss is suitable for which of the following task:

Regression
Classification
Clustering
None of the above

Q2 of 5 Which of the following optimizers has higher accuracy, steadier convergence, and better generalization in terms of the loss function:

Adabound optimization check
Adadelta optimization
Adam optimization
Adagrad optimization

Q3 of 5 Which of the following Loss function is suitable for Noisy data

Robust Bi-Tempered logistic loss check
Minimax loss
Perceptual loss
None of the above

Q4 of 5 Which of the following aspects of data can be handled by the bi-tempered loss function:

Outliers
Imbalanced data
Outliers and mislabeled data check
Unlabeled data

Q5 of 5 GAN loss function is used to:

Minimize the generator’s loss and maximize the discriminator’s loss check
Minimize the discriminator’s loss and maximize the generator’s loss
Minimize the generator’s loss and minimize the discriminator’s loss
Maximize the generator’s loss and maximize the discriminator’s loss

15. Advanced CNN - Quiz

Q1 of 5 Which of the following types of CNN was used for commercial purposes in ATMs and banks?

AlexNet
Convnet
LeNet check
None of the above

Q2 of 5 The idea of filter, padding and hyperparameters was first introduced in which type of the following CNN:

AlexNet check
LeNet
ConvNet
Neocognitron

Q3 of 5 Semantic and instance-based object segmentation can be successfully handled by the following CNN:

ConvNet
AlexNet
VGGNet check
None of the above

Q4 of 5 AlexNet architecture consists of:

5 convolutional layers and 3 fully connected layers check
3 convolutional layers and 5 fully connected layers
5 convolutional layers and 1 fully connected layer
1 convolutional layer and 1 fully connected layers

Q5 of 5 Which of the following activation functions is used in AlexNet?

ReLU check
Tanh
SoftMax
None of the above

16. Advanced GAN - Quiz

Q1 of 5 Which of the following is the task performed by generative models?

Creation of new examples based on the input data check
Classification of test samples
Regression of unseen data
Clustering the data

Q2 of 5 The generator component of the GAN architecture is:

The generator component of the model
The learning component of a GAN model check
The testing component of GAN model
The validation component of GAN model

Q3 of 5 Which of the following is the application of GAN architecture?

Generate human faces from examples
Produce the new samples for the data set
Create new picture based on the object description in sentence
All the above check

Q4 of 5 Which of the following is the task of discriminator model?

Accepts the output of Generator and classify it is as fake or real check
To train the model to create new data
To preprocess the data so that it is suitable for training
None of the above

Q5 of 5 ZCA (Zero-phase Component Analysis) is used for the following purpose:

To reduce the redundant pixels of an image check
To extract the features of an image
To segment the image
To classify the image

17. Wide and deep model - Quiz

Q1 of 3 The wide and deep model was developed by:

Facebook
Google check
IBM
None of the above

Q2 of 3 In wide and deep learning model, wide linear model is used to achieve:

Memorization check
Generalization
Memorization and generalization
None of the above

Q3 of 3 The app recommender in Google Play is developed based on which of the following technologies:

CNN
RNN
LSTM
Wide and deep model

Which of the following is not recommended while initializing weights in a network?

Zero
He
Glorot
Random

ChatGPT said:

The option "Zero" is not recommended for initializing weights in a neural network. Initializing weights to zero can lead to symmetry problems where all neurons in a layer compute the same gradients, making them update identically during backpropagation and effectively reducing the network to a much simpler one. It's generally better to initialize weights with small random values (e.g., using methods like Glorot, He, or Random initialization) to break symmetry.

You said:

Given a video clip, you have to identify the action that is happening in it. You chose a sequence model because:

the input is a sequence
the output is a sequence
both input and output are sequences
sequence models learn faster

ChatGPT said:

You would choose a sequence model for video action recognition because the input is a sequence. A video clip is a series of frames, and a sequence model like an LSTM or a Transformer can capture temporal dependencies across these frames to identify the action happening in the clip.

You said:

Enabling the machine to combine the power of memorization and generalization is called:

VGGNet
Wide and deep model
GoogleNet
RNN

ChatGPT said:

The term you're looking for is Wide and Deep model. This type of model combines the power of memorization (through wide models) and generalization (through deep models) in machine learning. It was popularized by Google and is often used in recommendation systems.

You said:

What should be the value of filter size and stride to arrive at the following results?

Filter size: (2x2); Stride: 2
Filter size: (2x2); Stride: 1
Filter size: (1x1); Stride: 1
Filter size: (2x1); Stride: 2

ChatGPT said:

To determine the filter size and stride based on the given options, here's how they would generally apply:

Filter size: (2x2); Stride: 2:
- This would reduce the size of the input by half in both dimensions.
Filter size: (2x2); Stride: 1:
- This would scan the input with overlapping windows, shrinking each spatial dimension by one.
Filter size: (1x1); Stride: 1:
- This would keep the spatial dimensions of the input unchanged, but it would allow for the adjustment of the depth dimension.
Filter size: (2x1); Stride: 2:
- With a stride of 2 in both directions, this would roughly halve both the height and the width; each window spans two rows and one column.

The correct filter size and stride to use depends on the specific change you're looking to make in the feature map.
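The effect of each option can be checked with a small max-pooling sketch. The 4x4 input grid is made up for illustration (the figure from the question is not available), and the helper only handles square windows.

```python
def max_pool(grid, size, stride):
    """Max pooling with a square window and no padding."""
    rows = (len(grid) - size) // stride + 1
    cols = (len(grid[0]) - size) // stride + 1
    return [
        [
            max(grid[r * stride + i][c * stride + j]
                for i in range(size) for j in range(size))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


grid = [  # hypothetical 4x4 feature map
    [1, 3, 2, 4],
    [5, 6, 7, 8],
    [3, 2, 1, 0],
    [1, 2, 3, 4],
]
pooled_stride_2 = max_pool(grid, size=2, stride=2)  # 2x2 output
pooled_stride_1 = max_pool(grid, size=2, stride=1)  # 3x3 output
```

Comparing the output shape with the expected result is usually enough to pick the filter size and stride.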

You said:

The task is predicting the most probable next character for a given input stream of characters. You have built the following RNN model for the task. Here, [...] denotes the actual character at time step t, and [...] denotes the predicted character for time step t. Given the above model and the notations, what does the RNN output at time step t? P(A) denotes the probability of A, and P(A | B) denotes the probability of A given B.

The character with the maximum value of P(y_t | y_(t-1))
The character with the maximum value of P(y_t | y_(t-1), y_(t-2), ..., y_1)
The character with the maximum value of P(y_t)
The character with the maximum value of P(y_(t-1))

ChatGPT said:

In an RNN model designed to predict the next character in a sequence, the output at time step t would typically be the character with the maximum value of P(y_t | y_(t-1), y_(t-2), ..., y_1). This is because the RNN uses information from all previous time steps to predict the next character.

So, the correct answer is:

The character with the maximum value of P(y_t | y_(t-1), y_(t-2), ..., y_1).

You said:

Only give me correct options from next question

ChatGPT said:

Understood. I'll provide only the correct option in response to the next question.

You said:

The advantage of YOLO:

It can be used for streaming video in real time
It applies a single CNN for both classification and localization of objects
It is faster
All of the above

ChatGPT said:

All of the above

You said:

Which of the following is the limitation of YOLO?

Cannot handle video stream
Object recognition is very slow as it takes more than one step
YOLO can't handle objects
None of the above

ChatGPT said:

None of the above

You said:

Which component is not required in the last layer while building a multi-layer perceptron for a regression problem?

Hidden layer
Non-linear activation function
Adaptive learning rate optimization algorithm
Output layer

ChatGPT said:

Non-linear activation function

You said:

Consider the given LSTM architecture:

What is the value of the current cell state Cn?

[c(n-1) * f(n)] + [I(n) * i(n)]
[c(n-1) * P(n)] + [I(n) * i(n)]
[c(n-1) * f(n)] + [I(n) * i(n)] + P(n)
[c(n-1) * f(n)] + [I(n) * i(n) * P(n)]

ChatGPT said:

[c(n-1) * f(n)] + [I(n) * i(n)]
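The update above can be sketched elementwise, assuming c(n-1) is the previous cell state, f(n) the forget gate, i(n) the input gate, and I(n) the candidate values. That reading of the notation is an assumption, since the original figure is not available, and the numbers are made up.

```python
def cell_state(previous_state, forget_gate, input_gate, candidate):
    """Elementwise LSTM cell state update: c(n-1) * f(n) + I(n) * i(n)."""
    return [
        c_prev * f + cand * i
        for c_prev, f, i, cand in zip(
            previous_state, forget_gate, input_gate, candidate
        )
    ]


c_n = cell_state(
    previous_state=[2.0, -1.0],
    forget_gate=[0.5, 1.0],   # 1.0 keeps the old value entirely
    input_gate=[1.0, 0.0],    # 0.0 blocks the new candidate entirely
    candidate=[0.5, 0.75],
)
```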

You said:

Intersection over Union (IoU) loss function is suitable for:

Single-stage object detection
Noisy data
GAN architecture
Imbalanced data

ChatGPT said:

Single-stage object detection
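For reference, IoU between two axis-aligned boxes can be computed as in the sketch below. The box format (x_min, y_min, x_max, y_max) and the simple `1 - IoU` loss are assumptions for illustration; detection frameworks use several variants.

```python
def iou(box_a, box_b):
    """Intersection over Union for boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))  # predicted vs. ground-truth box
iou_loss = 1.0 - overlap
```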

You said:

Wide and deep learning model can be used for which of the following:

Generic large-scale regression and classification with sparse inputs
Data set has categorical features with large number of possible values
To build recommender system
All of the above

ChatGPT said:

All of the above

You said:

Dropout is similar to which of the following techniques:

Boosting
Bagging
K-Fold Cross-Validation
Leave One Out Cross-Validation

ChatGPT said:

Bagging

You said:

What is false about an input layer?

Input is fed to the network through this layer.
It brings the initial data into the system for further processing by subsequent layers of artificial neurons.
The input layer is the very beginning of the workflow for the artificial neural network.
It is the rightmost layer in the architecture of a neural network.

ChatGPT said:

It is the rightmost layer in the architecture of a neural network.

You said:

Darknet is an open-source framework used in:

Convolutional Neural Network
LSTM
Recurrent Neural Network
None of the above

ChatGPT said:

Convolutional Neural Network

You said:

Can you use a pre-trained model like VGG16 on a medical image dataset?

True
False

ChatGPT said:

True

You said:

The Update gate of GRU performs the job of which gate(s) of LSTM?

Input gate alone
Forget gate alone
Input and Forget gate, combined
Forget and Output gate, combined

ChatGPT said:

Input and Forget gate, combined

You said:

For the given RNN architecture, if n = 24, how many times are the hidden layer(s) iterated, counting one feedforward pass and BPTT as one count?

12
24
1
48
2

ChatGPT said:

You said:

Which of the following activation functions does not have a mean value close to zero?

ELU
ReLU
LReLU
PReLU

ChatGPT said:

ReLU

You said:

Which of the following activation functions might end up creating dying neurons?

ELU
ReLU
LReLU
PReLU

ChatGPT said:

ReLU
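A rough sketch of why ReLU can produce dying neurons: when a neuron's pre-activations are all negative, the ReLU gradient is zero and its weights stop updating, whereas a leaky variant still passes a small gradient. The slope of 0.01 and the sample values are assumptions for illustration.

```python
def relu_grad(x):
    """Derivative of ReLU with respect to its input."""
    return 1.0 if x > 0 else 0.0


def leaky_relu_grad(x, slope=0.01):
    """Derivative of Leaky ReLU; `slope` applies to negative inputs."""
    return 1.0 if x > 0 else slope


pre_activations = [-3.0, -0.5, -1.2]  # a neuron whose inputs are all negative
relu_updates = [relu_grad(x) for x in pre_activations]
leaky_updates = [leaky_relu_grad(x) for x in pre_activations]
```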

You said:

The loss of a model initially started decreasing, but after a few epochs the loss became constant. What can be a possible reason?

The learning rate is low
Regularization parameter is high
Stuck at local minima
All of the above.

ChatGPT said:

All of the above.

You said:

Given two tasks A and B, in which of the following cases is transfer learning (from A to B) recommended rather than learning from scratch?

When both the tasks have different inputs.
When you have a lot more data for task A than task B.
When you have a lot more data for task B than task A.
When low level features from task B could be helpful for learning task A.

ChatGPT said:

When you have a lot more data for task A than task B.

You said:

Consider the given result obtained using the lr_finder method on image data. What optimum learning rate range should be chosen from the given graph?

0.001 to 0.01
0.0001 to 0.01
0.0001 to 0.001
0.001 to 0.1

ChatGPT said:

0.001 to 0.01

You said:

You are building a VGG16 network and have the starting code presented below. What should be present in the blank space?

base_image_path = "flowers"
imgs_df = pd.____({'img_path': os.listdir(base_image_path)})
ImageDataGenerator(rescale=1./255).flow_from_dataframe(imgs_df, directory=base_image_path, x_col='img_path', y_col=None, target_size=(224, 224), class_mode=None, batch_size=30)

DataFrame
Series
Table
Panel

ChatGPT said:

DataFrame

You said:

Given the auto-encoder function for an RGB image, what should be the values of X, Y and Z:

def auto_encoder(img):
    # Encoder module
    f_size = 3  # filter size
    p_size = 1  # pool size
    conv_1 = Conv2D(32, (f_size, f_size), activation='relu', padding='same')(img)
    pool_1 = MaxPooling2D(pool_size=(p_size, p_size))(conv_1)
    conv_2 = Conv2D(64, (f_size, f_size), activation='relu', padding='same')(pool_1)
    pool_2 = MaxPooling2D(pool_size=(p_size, p_size))(conv_2)
    conv_3 = Conv2D(128, (f_size, f_size), activation='relu', padding='same')(pool_2)
    # Decoder module
    conv_4 = Conv2D(X, (f_size, f_size), activation='relu', padding='same')(conv_3)
    up_1 = UpSampling2D((p_size, p_size))(conv_4)
    conv_5 = Conv2D(Y, (f_size, f_size), activation='relu', padding='same')(up_1)
    up_2 = UpSampling2D((p_size, p_size))(conv_5)
    decoded = Conv2D(Z, (f_size, f_size), activation='sigmoid', padding='same')(up_2)
    return decoded

X: 3, Y: 64, Z: 128
X: 32, Y: 64, Z: 128
X: 128, Y: 64, Z: 32
X: 128, Y: 64, Z: 3

ChatGPT said:

X: 128, Y: 64, Z: 3

You said:

What is true about Auto-Encoder?

It is an unsupervised ML algorithm
It reduces the same loss function as PCA
The neural network's target output is its input
All of the above

ChatGPT said:

All of the above

You said:

While building a CNN model, which component is responsible for reducing the dimensionality?

Activation function
Random weights and bias
Max pooling
Optimization algorithm

ChatGPT said:

Max pooling

You said:

Which network has overcome the translational and rotational problem of CNN?

Generative Adversarial Networks
Restricted Boltzmann Machines
Capsule Networks
Deep Reinforcement Learning

ChatGPT said:

Capsule Networks

You said:

You are performing a multivariate time series forecast using nine features with a window size of 50. What should be the value of the argument 'input_shape' while constructing the LSTM model?

(9, 50)
(50, 9)
(9, arange(50))
(9, range(50))

ChatGPT said:

(50, 9)
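The (time steps, features) ordering can be seen by slicing a series into windows, as in the sketch below. The helper name, the series length of 60, and the dummy values are assumptions for illustration.

```python
def make_windows(series, window_size):
    """Slice a list of per-step feature vectors into overlapping windows."""
    return [
        series[start:start + window_size]
        for start in range(len(series) - window_size + 1)
    ]


n_features, window_size, n_steps = 9, 50, 60
series = [[float(t)] * n_features for t in range(n_steps)]  # dummy data
windows = make_windows(series, window_size)

# Each sample is one window: 50 time steps, each with 9 feature values.
input_shape = (len(windows[0]), len(windows[0][0]))
```

The number of windows (the batch dimension) is left out of `input_shape`, which describes a single sample only.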

You said:

In which of the given cases does transfer learning fail?

When the new domain is very different from the original domain (the domain where the original model has been trained).
When the new domain does not provide enough samples for fine-tuning.
When the new domain differs slightly from the original domain (the domain where the original model has been trained).
Both a and b.

ChatGPT said:

Both a and b.

You said:

While setting up the dropout regularization in Keras, which of the following percentages is not acceptable?

40%
50%
60%

ChatGPT said:

60%

You said:

You have to build a model that can fill in missing words in a paragraph. Would you go for unidirectional RNN or Bidirectional RNN?

Unidirectional RNN, since it is easier to train
Bidirectional RNN, since it will give more context
Both will perform equally well
Neither, since RNNs are incapable of performing such tasks

ChatGPT said:

Bidirectional RNN, since it will give more context

You said:

While training on noisy data, which of the following loss functions is preferred:

Bi-tempered loss function
Mean Square error loss
Cross entropy loss function
Mean bias error

ChatGPT said:

Bi-tempered loss function

You said:

For a multi-layer perceptron, if you increase the number of hidden layers in the network, it can always help decrease the loss.

True
False

ChatGPT said:

False
