Practice Questions for Data Science

1. What is the primary goal of Data Science?

(a) Data Entry (b) Extracting insights from data (c) Creating databases (d) Designing websites

2. Which of the following is NOT a part of Data Science?

(a) Machine Learning (b) Statistics (c) Web Development (d) Data Visualization

3. What is supervised learning in Machine Learning?

(a) Learning without labeled data (b) Learning from labeled data (c) Learning without algorithms (d) Learning from predefined rules

4. Which library is commonly used for data manipulation in Python?

(a) NumPy (b) Pandas (c) Matplotlib (d) Scikit-learn

5. What does ‘NaN’ stand for in Data Science?

(a) Not a Number (b) No allocated Number (c) New artificial Number (d) Numeric and Nominal

6. Which of the following is NOT a type of Machine Learning?

(a) Supervised Learning (b) Reinforcement Learning (c) Associative Learning (d) Unsupervised Learning

7. What is the purpose of an activation function in a neural network?

(a) To connect layers (b) To introduce non-linearity (c) To initialize weights (d) To create hidden layers

8. Which technique is used to handle missing data in a dataset?

(a) Removing rows (b) Imputation (c) Both (a) and (b) (d) Ignoring missing values

9. Which programming language is most commonly used in Data Science?

(a) C++ (b) Java (c) Python (d) Swift

10. What does ‘overfitting’ mean in Machine Learning?

(a) The model performs well on both training and test data (b) The model learns patterns from noise in the training data (c) The model has too few parameters (d) The model performs poorly on training data

11. What is the purpose of cross-validation in Machine Learning?

(a) To increase model complexity (b) To improve computation speed (c) To estimate model performance on unseen data (d) To remove outliers

12. Which of the following is an unsupervised learning algorithm?

(a) Decision Tree (b) Linear Regression (c) K-Means Clustering (d) Random Forest

13. What does the term ‘Big Data’ refer to?

(a) Large-sized Excel files (b) Extremely large and complex datasets (c) Fast internet speed (d) High-resolution images

14. Which statistical measure is used to find the middle value in a dataset?

(a) Mean (b) Median (c) Mode (d) Variance

15. What is the primary purpose of dimensionality reduction?

(a) To increase dataset size (b) To reduce the number of features while retaining important information (c) To slow down computations (d) To add new variables

16. Which of the following is a performance metric for classification models?

(a) Mean Absolute Error (b) Precision (c) Root Mean Squared Error (d) Sum of Squares

17. In a normal distribution, approximately what percentage of data falls within one standard deviation of the mean?

(a) 50% (b) 68% (c) 95% (d) 99%

18. What is the main purpose of feature scaling?

(a) To convert categorical data into numerical form (b) To make features comparable in scale (c) To increase data size (d) To remove outliers

19. What is the key difference between classification and regression?

(a) Classification predicts categories, regression predicts continuous values (b) Classification is supervised, regression is unsupervised (c) Regression works only with neural networks (d) Classification requires deep learning

20. Which of the following algorithms is commonly used for text classification tasks?

(a) K-Means (b) Naïve Bayes (c) Principal Component Analysis (d) K-Nearest Neighbors

21. What is overfitting in machine learning?

(a) A model that performs well on the training data but poorly on unseen data (b) A model that performs well on both training and testing data (c) A model that uses only a small amount of training data (d) A model that performs poorly on both training and testing data

22. What is a confusion matrix in machine learning?

(a) A table used to evaluate the performance of a classification algorithm (b) A table used to evaluate the performance of a regression model (c) A method for calculating the accuracy of a model (d) A tool to visualize decision boundaries

23. What is the purpose of the "learning rate" in machine learning?

(a) It sets the total amount of time the model takes to train (b) It determines the size of each parameter update, and therefore how quickly the model learns (c) It helps to avoid overfitting by reducing the model complexity (d) It increases the variance of the model's predictions

24. What is the difference between supervised and unsupervised learning?

(a) Supervised learning requires labeled data, unsupervised learning does not (b) Supervised learning works only with classification tasks, unsupervised learning with regression (c) Supervised learning works with unsorted data, unsupervised learning with labeled data (d) Both supervised and unsupervised learning require labeled data

25. What is a decision tree in machine learning?

(a) A model that predicts outcomes based on hierarchical data splits (b) A model that performs linear regression (c) A type of clustering algorithm (d) A model used only for classification problems

26. What is cross-validation in machine learning?

(a) A method for estimating how well a model generalizes by training and evaluating it on multiple splits of the dataset (b) A technique for increasing the model's complexity (c) A method of optimizing hyperparameters in a model (d) A method to train a model faster

27. What is feature engineering?

(a) The process of selecting, modifying, or creating new features from raw data (b) The process of splitting data into training and testing sets (c) The process of reducing the dimensionality of the data (d) The process of visualizing data

28. What is the purpose of normalization in machine learning?

(a) To scale features to a standard range to improve model accuracy (b) To reduce the complexity of the model (c) To increase the size of the dataset (d) To eliminate categorical features from the data

29. What is a neural network?

(a) A series of algorithms that attempt to recognize underlying relationships in a set of data (b) A type of clustering algorithm (c) A type of decision tree (d) A linear regression model

30. What is a random forest in machine learning?

(a) A collection of decision trees that work together for more accurate predictions (b) A type of neural network (c) A type of clustering algorithm (d) A type of regression model

31. What is the purpose of regularization in machine learning?

(a) To reduce overfitting by penalizing large model coefficients (b) To increase model complexity (c) To reduce the size of the dataset (d) To increase the learning rate

32. What is the "bias-variance tradeoff" in machine learning?

(a) The balance between underfitting and overfitting (b) The tradeoff between model size and accuracy (c) The tradeoff between feature selection and feature engineering (d) The tradeoff between model complexity and dataset size

33. What is the purpose of the "activation function" in a neural network?

(a) To introduce non-linearity to the model (b) To optimize the weights during training (c) To prevent overfitting (d) To calculate the loss function

34. What is "gradient descent" in machine learning?

(a) A method for optimizing a model by adjusting weights to minimize the loss function (b) A method for scaling the dataset (c) A technique for regularizing models (d) A method to increase the learning rate

35. What is the purpose of cross-entropy loss in classification problems?

(a) To measure the difference between predicted and true probabilities (b) To minimize overfitting (c) To calculate the accuracy of the model (d) To improve the model's interpretability

36. What is the difference between L1 and L2 regularization?

(a) L1 regularization penalizes the absolute values of the coefficients, while L2 regularization penalizes their squared values (b) L1 regularization is more prone to overfitting than L2 regularization (c) L1 regularization is used only in decision trees (d) L2 regularization removes features, while L1 regularization adds them

37. What is the difference between bagging and boosting?

(a) Bagging trains multiple models independently, while boosting trains them sequentially (b) Bagging improves performance on imbalanced datasets, boosting does not (c) Boosting is used for regression tasks, bagging for classification tasks (d) Bagging and boosting are both types of unsupervised learning algorithms

38. What is dimensionality reduction?

(a) A process of reducing the number of features in a dataset while retaining important information (b) A technique to increase the complexity of a model (c) A method to increase the number of features in a dataset (d) A technique to optimize the data for machine learning algorithms

39. What is feature scaling?

(a) A technique used to standardize or normalize features in a dataset to improve model performance (b) A method to reduce the number of features (c) A technique to increase the dataset size (d) A technique to improve the interpretability of the model

40. What is "ensemble learning" in machine learning?

(a) A method that combines multiple models to improve prediction accuracy (b) A technique to reduce the size of the dataset (c) A method for optimizing the learning rate (d) A type of supervised learning

41. What is the "curse of dimensionality" in machine learning?

(a) The problem of too many features in a dataset causing models to perform poorly (b) The difficulty of handling large datasets (c) The increase in data storage requirements as the number of features increases (d) The problem of models becoming too complex to interpret

42. What is the difference between supervised and unsupervised learning?

(a) Supervised learning uses labeled data, while unsupervised learning uses unlabeled data (b) Supervised learning is used for regression tasks, while unsupervised learning is for classification tasks (c) Supervised learning is used only for deep learning, unsupervised for decision trees (d) There is no difference between the two

43. What is overfitting in machine learning?

(a) When a model learns the details of the training data too well, leading to poor performance on new data (b) When a model underperforms on both training and test data (c) When a model performs well only on test data (d) When a model ignores important features in the dataset

44. What is the purpose of the "learning rate" in a machine learning model?

(a) To control the size of each parameter update, and therefore the speed at which the model learns during training (b) To define the size of the dataset used for training (c) To decide the number of layers in a neural network (d) To adjust the number of features in a dataset

45. What is the purpose of dropout in neural networks?

(a) To reduce overfitting by randomly "dropping" units during training (b) To increase the complexity of the network (c) To speed up the training process (d) To decrease the size of the dataset

46. What is a confusion matrix in machine learning?

(a) A table used to evaluate the performance of classification algorithms (b) A plot used to visualize the training data (c) A metric used to measure the variance in predictions (d) A technique used to prevent model overfitting

47. What is the purpose of feature engineering in machine learning?

(a) To extract important information from raw data for model training (b) To clean the data by removing noisy values (c) To visualize the results of the model (d) To test the model's accuracy

48. What is the "bias-variance tradeoff" in machine learning?

(a) The balance between underfitting and overfitting (b) The relationship between training and test data accuracy (c) The need for more data in deep learning models (d) The problem of irrelevant features in the dataset

49. What is the difference between bagging and boosting?

(a) Bagging trains models independently, while boosting trains models sequentially (b) Bagging gives more weight to misclassified data points, while boosting does not (c) Bagging is used for regression, while boosting is used for classification (d) There is no difference between the two

50. What is the purpose of the "exploration phase" in data science?

(a) To understand and prepare the data before modeling (b) To evaluate the model performance after training (c) To tune the hyperparameters of the model (d) To collect new data for training
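
Several of the questions above (14, 16, 17 and 22) involve quantities that can be checked by hand. The following is a minimal sketch, using only the Python standard library, of how one might compute them. The sample values and the confusion-matrix counts are hypothetical, chosen only for illustration, and are not taken from the question set.

```python
from statistics import NormalDist, mean, median

# Hypothetical sample with one extreme value, used to compare mean and median.
values = [2, 3, 3, 4, 100]
sample_mean = mean(values)
sample_median = median(values)

# Share of a normal distribution that lies within one standard deviation
# of the mean, computed from the cumulative distribution function.
std_normal = NormalDist(mu=0.0, sigma=1.0)
within_one_sd = std_normal.cdf(1.0) - std_normal.cdf(-1.0)

# Hypothetical confusion-matrix counts for a binary classifier:
# true positives, false positives, false negatives, true negatives.
tp, fp, fn, tn = 40, 10, 5, 45
precision = tp / (tp + fp)                  # correct share of predicted positives
recall = tp / (tp + fn)                     # share of actual positives found
accuracy = (tp + tn) / (tp + fp + fn + tn)  # correct share of all predictions

print(f"mean={sample_mean}, median={sample_median}")
print(f"share within one standard deviation: {within_one_sd:.4f}")
print(f"precision={precision:.2f}, recall={recall:.2f}, accuracy={accuracy:.2f}")
```

Running the sketch and comparing the printed values with the answer options is one way to self-check. The extreme value in the sample shows why the median is often preferred as a "middle value" when outliers are present.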

Computer Science Short Questions

1. What is an algorithm?

2. What is Big-O notation?

3. What is a binary search?

4. What is a linked list?

5. What is a stack?

6. What is a queue?

7. What is a tree in computer science?

8. What is a graph?

9. What is a hash table?

10. What is a neural network?

11. What is a stack?

12. What is a queue?

13. What is a binary tree?

14. What is a linked list?

15. What is dynamic programming?

16. What is the difference between a deep copy and a shallow copy?

17. What is the purpose of the "this" keyword in Java?

18. What is polymorphism in object-oriented programming?

19. What is encapsulation in object-oriented programming?

20. What is inheritance in object-oriented programming?

21. What is an abstract class in Java?

22. What is an interface in Java?

23. What is a constructor in Java?

24. What is the difference between a static and a non-static method?

25. What is the difference between == and .equals() in Java?

26. What is a thread in Java?

27. What is synchronization in Java?
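
For questions 3, 5 and 6 above, a small illustration may help when drafting answers. The following is a minimal sketch in Python, one possible way to write it rather than a reference solution. The function name `binary_search` and the sample data are assumptions made for this example.

```python
from collections import deque


def binary_search(sorted_items, target):
    """Return the index of target in sorted_items, or -1 if it is absent."""
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        mid = (low + high) // 2
        if sorted_items[mid] == target:
            return mid
        if sorted_items[mid] < target:
            low = mid + 1   # target can only be in the upper half
        else:
            high = mid - 1  # target can only be in the lower half
    return -1


# Stack: last in, first out (LIFO), sketched with a plain list.
stack = []
stack.append("a")
stack.append("b")
top = stack.pop()        # removes the most recently added item

# Queue: first in, first out (FIFO), sketched with collections.deque.
queue = deque()
queue.append("a")
queue.append("b")
front = queue.popleft()  # removes the earliest added item

found_at = binary_search([1, 3, 5, 7, 9, 11], 7)
missing = binary_search([1, 3, 5, 7, 9, 11], 4)
```

Binary search halves the remaining range on every step, so it needs a sorted input and takes O(log n) comparisons. The stack and queue differ only in which end an item is removed from.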

FULL STACK JAVA PROGRAMMING