1)
For each problem below, determine whether Text Classification or Topic Analysis is more appropriate to solve it.
Group of answer choices
1. Partition the reviews of a company’s flagship product into homogeneous clusters
[ Choose ] Text Classification Topic Analysis
2. Automatically label an email as work-related or not
[ Choose ] Text Classification Topic Analysis
3. Predict whether an essay written by a college applicant is of low quality or high quality
[ Choose ] Text Classification Topic Analysis
4. Given a set of essays written by college applicants, group them by topic
[ Choose ] Text Classification Topic Analysis
2) In the context of topic analysis, which matrix is the most appropriate to describe each topic?
The topic-term matrix
The document-term matrix
The document-topic matrix
3)
Among the classification tasks below, for which is logistic regression better suited than Naïve Bayes?
Group of answer choices
Find which tweets are written by a native English speaker vs a non-native English speaker
Classify emails as sad, happy, or neutral
Classify emails as important or not important
Predict which emails are spam
4)
In Wikipedia, an “edit war” is said to take place when two contributors revert each other’s edits multiple times. Here is an example of an edit war:
Version | Text |
---|---|
Version 1 | The tree is tall |
Version 2 | The shrub is short |
Version 3 | The tree is tall |
Version 4 | The shrub is short |
Complete the following edit matrix corresponding to the edit war above. The first row has already been completed. Be sure to enter your answers carefully. Format your numbers as in the following examples: 1, -1, 2, -2, etcetera.
Edit | Tree | Shrub | Tall | Short |
---|---|---|---|---|
Edit 1 (ver1→ver2) | -1 | 1 | -1 | 1 |
Edit 2 (ver2→ver3) | | | | |
Edit 3 (ver3→ver4) | | | | |
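
One way to read the edit matrix is sketched below. It assumes that each row is the count of each tracked term in the newer version minus its count in the older version. That interpretation is consistent with the completed first row, but the question does not state it outright. The helper names `term_counts` and `edit_row` are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Versions copied from the edit-war table above.
versions = [
    "The tree is tall",
    "The shrub is short",
    "The tree is tall",
    "The shrub is short",
]

# Columns of the edit matrix, in the order used by the question.
terms = ["tree", "shrub", "tall", "short"]


def term_counts(text):
    """Count lowercase, whitespace-separated tokens (hypothetical helper)."""
    return Counter(text.lower().split())


def edit_row(old, new):
    """Assumed definition: term count in the new version minus the old one."""
    old_counts, new_counts = term_counts(old), term_counts(new)
    return [new_counts[t] - old_counts[t] for t in terms]


# One row per consecutive pair of versions (ver1 to ver2, ver2 to ver3, ...).
edit_matrix = [edit_row(a, b) for a, b in zip(versions, versions[1:])]

# Sanity check against the row the question already filled in.
assert edit_matrix[0] == [-1, 1, -1, 1]
```

Printing `edit_matrix` gives one list per edit, which can be compared against hand-computed rows.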