18.5 DISTINCT, TOP, and UNION
Now that you've learned enough to become dangerous with SQL, let's teach you a few more of the common commands that can be very useful. However, check with your instructor to be clear on which SQL commands that she or he wants to cover in this class.
The data file below contains actual Twitter data with a bunch of extra information about the importance of the "Tweeters." Each tweet is about a product called "Amazon Web Services" which is one of the most popular cloud computing platforms. Open the Access database and examine the fields. You may have already used this data file previously for an assignment. But in case it is new to you, this is what each field refers to:
- Fname: first name of the person who Tweeted
- Lname: last name of the person who Tweeted
- IsReshare: true = the Tweet was a "Reshare" of another Tweet
- RetweetCount: the number of times this particular Tweet (not the original if this is a retweet) has been retweeted
- Klout: a unique score representing the importance of the person who tweeted based on their volume of followers and historical likes and retweets
- Sentiment: the tone of the Tweet; higher numbers indicated a more positive tone; lower is more negative tone
- Country: the country where the Tweeter resides
- State: the state where the Tweeter resides
- City: the city where the Tweeter resides
- text: the body of the Tweet including links (NOTICE: this is real Twitter data so please do not be offended by the content)
Ok, ready to write some SQL? Create a new Query to answer each of the questions or return the data required below. The video solution is found along with each exercise.
-
Create a list of all retweets. Include only the Tweet. Order it by Tweet alphabetically descending
-
Create a list of all Tweeters. Order by Lname ascending and then Fname descending
-
What countries do the tweets originate from? Do not list duplicates and sort the countries alphabetically. HINT: use DISTINCT
- DISTINCT
-
Who has the top 10 Klout scores? Sort them by score descending. Return their names and scores. HINT: use TOP10
- TOP
-
Which tweets were retweeted at least 50 times? Give me their name, location country, state, city, the tweet itself, and the actual number of retweets
-
How many tweets came from each country?
-
What were the 20 tweets with the most negative sentiment? List the sentiment, tweet, Klout score, and the number of retweets that each got
-
Now show the 20 tweets with the most positive sentiment? List the sentiment, tweet, Klout score, and the number of retweets that each got. By the way, what do you learn from this? What is the effect of the tone of your tweet on the number of retweets that you get?
-
What is the average Klout score by country? Return the country and Klout score.
-
Show me a list of Tweeters from each country outside of the US with Klout scores between 60 and 80. Order the list by country.
-
Which state (in the US) is using twitter the most? Show me a list of the number of tweets sent from each state, starting with the most
-
What is the average retweet count for tweets that include a URL link versus those that don't?
- UNION