Which ONE of the following options does NOT describe a challenge for acquiring test data in ML systems?
SELECT ONE OPTION
Challenges for Acquiring Test Data in ML Systems: Compliance needs, the changing nature of data over time, and sourcing data from public sources are all significant challenges. Data being generated quickly is generally not a challenge in itself; it can be beneficial, because it makes more data available for training and testing.
Reference: ISTQB_CT-AI_Syllabus_v1.0, Sections on Data Preparation and Data Quality Issues.
A wildlife conservation group would like to use a neural network to classify images of different animals. The algorithm is going to be used on a social media platform to automatically pick out pictures of the chosen animal of the month. This month's animal is set to be a wolf. The test team has already observed that the algorithm could classify a picture of a dog as being a wolf because of the similar characteristics between dogs and wolves. To handle such instances, the team is planning to train the model with additional images of wolves and dogs so that the model is able to better differentiate between the two.
What test method should you use to verify that the model has improved after the additional training?
Back-to-back testing is used to compare two different versions of an ML model, which is precisely what is needed in this scenario.
The model initially misclassified dogs as wolves due to feature similarities.
The test team retrains the model with additional images of dogs and wolves.
The best way to verify whether this additional training improved classification accuracy is to compare the original model's output with the newly trained model's output using the same test dataset.
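The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `old_model` and `new_model` are hypothetical stand-ins for the classifier before and after retraining, and the string "images" and labels are made-up placeholders rather than real data.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical types: an "image" is just a string identifier here.
Model = Callable[[str], str]


def accuracy(model: Model, inputs: Sequence[str], labels: Sequence[str]) -> float:
    """Fraction of inputs for which the model predicts the expected label."""
    correct = sum(1 for x, y in zip(inputs, labels) if model(x) == y)
    return correct / len(labels)


def back_to_back(old: Model, new: Model,
                 inputs: Sequence[str], labels: Sequence[str]) -> Dict[str, object]:
    """Run both model versions on the SAME test set and compare the results."""
    disagreements: List[tuple] = [
        (x, old(x), new(x)) for x in inputs if old(x) != new(x)
    ]
    return {
        "old_accuracy": accuracy(old, inputs, labels),
        "new_accuracy": accuracy(new, inputs, labels),
        "disagreements": disagreements,
    }


# Stand-in models (assumptions): the old one confuses dogs with wolves.
def old_model(image: str) -> str:
    return "wolf" if image.startswith(("wolf", "dog")) else "other"


def new_model(image: str) -> str:
    if image.startswith("wolf"):
        return "wolf"
    return "dog" if image.startswith("dog") else "other"


test_inputs = ["wolf_1", "dog_1", "dog_2", "cat_1"]
test_labels = ["wolf", "dog", "dog", "other"]

report = back_to_back(old_model, new_model, test_inputs, test_labels)
print(report)
```

Listing the individual disagreements, not just the two accuracy figures, shows which inputs changed classification after retraining, including any that got worse.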
Why Other Options Are Incorrect:
A (Metamorphic Testing): Metamorphic testing is useful for generating new test cases based on existing ones but does not directly compare different model versions.
B (Adversarial Testing): Adversarial testing is used to check how robust a model is against maliciously perturbed inputs, not to verify training effectiveness.
C (Pairwise Testing): Pairwise testing is a combinatorial technique for reducing the number of test cases by focusing on key variable interactions, not for validating model improvements.
Supporting Reference from ISTQB Certified Tester AI Testing Study Guide:
ISTQB CT-AI Syllabus (Section 9.3: Back-to-Back Testing)
'Back-to-back testing is used when an updated ML model needs to be compared against a previous version to confirm that it performs better or as expected'.
'The results of the newly trained model are compared with those of the prior version to ensure that changes did not negatively impact performance'.
Conclusion:
To verify that the model's performance improved after retraining, back-to-back testing is the most appropriate method as it compares both model versions. Hence, the correct answer is D.
Which of the following is an example of a clustering problem that can be resolved by unsupervised learning?
Clustering is a form of unsupervised learning that groups data points based on similarities, without predefined labels. According to the ISTQB CT-AI Syllabus, clustering is used in scenarios where:
The objective is to find natural groupings in data.
The dataset does not have labeled outputs.
Patterns and structures need to be identified automatically.
Analyzing the answer choices:
A. Associating shoppers with their shopping tendencies (Correct)
Shoppers can be grouped based on purchasing behaviors (e.g., luxury shoppers vs. budget-conscious shoppers), which is a typical clustering application in market segmentation.
B. Grouping individual fish together based on their types of fins (Incorrect)
If the types of fins are labeled, it becomes a classification problem, which requires supervised learning.
C. Classifying muffin purchases based on packaging attractiveness (Incorrect)
This is classification, not clustering, because attractiveness scores or labels must be predefined.
D. Estimating the expected purchase of cat food after an ad campaign (Incorrect)
This is a prediction task, best suited for regression models, which are part of supervised learning.
Thus, Option A is the best answer, as clustering is used to group shoppers based on tendencies without predefined labels.
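To make the shopper example concrete, here is a minimal k-means sketch in plain Python. It is an illustration only: the two features (average basket value, visits per month), the sample values, and the choice of the first and last shopper as initial centroids are all assumptions made for the example, and k-means is just one of several clustering algorithms.

```python
from math import dist  # Euclidean distance, Python 3.8+


def k_means(points, centroids, iterations=10):
    """Group points around the nearest centroid; no labels are used."""
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    assignments = [
        min(range(len(centroids)), key=lambda i: dist(p, centroids[i])) for p in points
    ]
    return assignments, centroids


# Made-up shoppers: (average basket value, visits per month).
shoppers = [(20, 8), (25, 10), (22, 9), (200, 1), (220, 2), (210, 1)]

# Assumption: seed the two centroids with the first and last shopper.
groups, centers = k_means(shoppers, [shoppers[0], shoppers[-1]])
print(groups)
print(centers)
```

The algorithm is never told which shoppers are "budget" or "luxury"; the two groups emerge from the data alone, which is what distinguishes clustering from classification. In practice the features would usually be scaled first so that one of them does not dominate the distance.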
Certified Tester AI Testing Study Guide Reference:
ISTQB CT-AI Syllabus v1.0, Section 3.1.2 (Unsupervised Learning - Clustering and Association)
ISTQB CT-AI Syllabus v1.0, Section 3.3 (Selecting a Form of ML - Clustering).
Which of the following is correct regarding the layers of a deep neural network?
A deep neural network (DNN) is a type of artificial neural network that consists of multiple layers between the input and output layers. The ISTQB Certified Tester AI Testing (CT-AI) Syllabus outlines the following characteristics of a DNN:
Structure of a Deep Neural Network:
A DNN comprises at least three types of layers:
Input layer: Receives the input data.
Hidden layers: Perform complex feature extraction and transformations.
Output layer: Produces the final prediction or classification.
Analysis of Answer Choices:
A (Only input and output layers): Incorrect, as a DNN must have at least one hidden layer.
B (At least one internal hidden layer): Correct, as a neural network must have hidden layers to be considered deep.
C (Minimum of five layers required): Incorrect, as there is no strict definition that requires at least five layers.
D (Output layer is not connected to other layers): Incorrect, as the output layer must be connected to the hidden layers.
Thus, Option B is the correct answer, as a deep neural network must have at least one hidden layer.
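The layer structure described above can be illustrated with a small forward pass in plain Python. This is a sketch under assumptions: the weights, biases, layer sizes, and activation functions are arbitrary values chosen for the example, not anything taken from the syllabus.

```python
import math


def relu(z):
    return max(0.0, z)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs, weights, biases, activation):
    """One fully connected layer: each neuron sees every value from the previous layer."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(x, layers):
    """Pass the input through each layer in turn, feeding outputs forward."""
    activations = x
    for weights, biases, activation in layers:
        activations = dense(activations, weights, biases, activation)
    return activations


# Arbitrary example network: 3 inputs -> 1 hidden layer (2 neurons) -> 1 output.
layers = [
    ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1], relu),  # hidden layer
    ([[1.0, -1.0]], [0.0], sigmoid),                           # output layer
]

output = forward([1.0, 2.0, 3.0], layers)
print(output)
```

The output layer takes the hidden layer's activations as its input, which is why an output layer that is "not connected to other layers" could not produce a prediction.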
Certified Tester AI Testing Study Guide Reference:
ISTQB CT-AI Syllabus v1.0, Section 6.1 (Neural Networks and Deep Neural Networks)
ISTQB CT-AI Syllabus v1.0, Section 6.2 (Structure of Deep Neural Networks).
Which of the following is a dataset issue that can be resolved using pre-processing?
Pre-processing is an essential step in data preparation that ensures data is clean, formatted correctly, and structured for effective machine learning (ML) model training. One common issue that can be resolved during pre-processing is numbers stored as strings.
Explanation of Answer Choices:
Option A: Insufficient data
Incorrect. Pre-processing cannot resolve insufficient data. If data is lacking, techniques like data augmentation or external data collection are needed.
Option B: Invalid data
Incorrect. While pre-processing can identify and handle some forms of invalid data (e.g., missing values, duplicate entries), it does not resolve all invalid data issues. Some cases may require domain expertise to determine validity.
Option C: Wanted outliers
Incorrect. Pre-processing usually focuses on handling unwanted outliers. Wanted outliers need to be preserved, which is a data selection decision and not an issue for pre-processing to resolve.
Option D: Numbers stored as strings
Correct. One of the key functions of data pre-processing is data transformation, which includes converting incorrectly formatted data types, such as numbers stored as strings, into their correct numerical format.
ISTQB CT-AI Syllabus Reference:
Data Pre-Processing Steps: 'Transformation: The format of the given data is changed (e.g., breaking an address held as a string into its constituent parts, dropping a field holding a random identifier, converting categorical data into numerical data, changing image formats)'.
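As an illustration of this kind of transformation, the sketch below converts numbers stored as strings into floats. The field names, the sample records, and the decision to treat commas as thousands separators are assumptions made for the example; real data would need its own parsing rules.

```python
def to_number(value):
    """Convert a numeric string to float; return None if it cannot be parsed."""
    if isinstance(value, (int, float)):
        return float(value)
    try:
        # Assumption: commas are thousands separators, e.g. "55,000.50".
        return float(value.strip().replace(",", ""))
    except (ValueError, AttributeError):
        return None  # left for a later step (e.g. handling missing values)


# Made-up raw records in which every number arrived as text.
raw_rows = [
    {"age": "42", "income": "55,000.50"},
    {"age": " 37 ", "income": "n/a"},
]

cleaned_rows = [
    {field: to_number(value) for field, value in row.items()} for row in raw_rows
]
print(cleaned_rows)
```

Values that cannot be converted are returned as `None` instead of raising an error, so the conversion step stays separate from the decision about how to treat invalid or missing data.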