Tutorials

Learn how to use R.

What we need to do
1. Sign up as a Twitter developer
2. Navigate Twitter
3. Load tweets into a dataframe
4. Process text with quanteda

Signup


  1. Go to Twitter Signup
  2. Include your phone in your profile.
  3. Verify your identity with Twitter by text message (SMS).
  4. Go to Twitter App
  5. Click on “Create New App”
    • name: must be unique
    • description: This is to use with R
    • website: https://www.viu.ca
  6. Click on the “Keys and Access Tokens” tab.
  7. Copy/paste “Consumer Key (API Key)” and “Consumer Secret (API Secret)” somewhere safe.
  8. Click “Create my access token”.
  9. Copy/paste “Access Token” and “Access Token Secret” somewhere safe.

Install twitteR


install.packages( 'twitteR' )
library( 'twitteR' )

Set up account in R

We create four R objects from the keys and access tokens generated by Twitter.

Do not copy and paste the values shown below.

You need to use the values associated with YOUR Twitter account.

consumer_key    <- "IGWVbItQSpNI4WzjYzk4VvgKS" 
consumer_secret <- "2dkYOJn2NNuJbLyNS68CLdHDuU30Raw82zDnRZFwpWL6fietP0"
access_token    <- "198230874-W4G81SX3ApmTz551IQi5M55oKmOHz7LyTzUWzeIr"
access_secret   <- "MGRwEJ9LndHrQnDxmuFoV01lGRElx7YGSIKfRuDzIFsKwXEwBw"

Connect to the Twitter API.

setup_twitter_oauth(consumer_key, consumer_secret, access_token, access_secret)
## [1] "Using direct authentication"

If R then asks whether to cache the OAuth credentials in a local file, enter “2” for “No”.
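Keys typed directly into a script are easy to leak when the script is shared. One possible alternative, shown here only as a minimal sketch, is to keep each value in an environment variable (for example in your ~/.Renviron file) and read it at run time. The helper read_credential() and the variable name TWITTER_CONSUMER_KEY are hypothetical names chosen for this illustration; they are not part of twitteR.

```r
# Hypothetical helper (not part of twitteR): read one credential from an
# environment variable and stop early if it is missing or empty.
read_credential <- function(name) {
  value <- Sys.getenv(name, unset = "")
  if (!nzchar(value)) {
    stop("Environment variable ", name, " is not set", call. = FALSE)
  }
  value
}

# Demo only: set a placeholder so the sketch runs without a real key.
# In practice the variable would be defined outside the script.
Sys.setenv(TWITTER_CONSUMER_KEY = "placeholder-key")

consumer_key <- read_credential("TWITTER_CONSUMER_KEY")
```

The same helper can be called once for each of the four objects before they are passed to setup_twitter_oauth().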

Keywords


We search Twitter for the keyword ‘modi’. The search returns the most recent tweets that contain the keyword.

For example, fetch the last 300 tweets about Modi.

modi <- searchTwitter("modi", n=300)

Arrange them in a dataframe.

modi_df <- twListToDF(modi)
names(modi_df)
##  [1] "text"          "favorited"     "favoriteCount" "replyToSN"    
##  [5] "created"       "truncated"     "replyToSID"    "id"           
##  [9] "replyToUID"    "statusSource"  "screenName"    "retweetCount" 
## [13] "isRetweet"     "retweeted"     "longitude"     "latitude"
head(modi_df$text)
## [1] "RT @drkiritpsolanki: Retweeted Narendra Modi (@narendramodi):\n\nCongratulations @ManushiChhillar! India is proud of your accomplishment."   
## [2] "@ShashiTharoor It shows your narrow mindset and how low you can be to oppose Modi ji? It shows that Congress can go… https://t.co/wh6nwzXy2o"
## [3] "@ShashiTharoor The hatred of Tharoor shown again. Can we compare him like thar desert. Education is nothing mr Thar… https://t.co/cm1RLGpEfp"
## [4] "RT @jaiprakashshah2: Modi’s masterstroke!!! Afghan Leader says Afghanistan no longer dependent on Pakistan after opening of Chabahar Port h…"
## [5] "@sardesairajdeep What else can u expect from an intellectual lumpen like @AzmiShabana ?Both of u r on same page on ur hatred view of Modi."  
## [6] "RT @kiran_patniak: राफेल डील पर कांग्रेस के आरोप मोदी सरकार ने खारिज किये, कहा - अब सौदे पर झगड़ा सेना का अपमान!!~ खुद ही चोर खुद ही जज और…"

Timeline


With userTimeline(), we fetch the last n tweets from a specific Twitter account. For example, get the last 300 tweets from Peter.

peter <- userTimeline("peterpsquare", n=300)

Arrange them in a dataframe.

peter_df <- twListToDF(peter)
head(peter_df$text)
## [1] "\xed\xa0\xbd\xed\xb1\x8a\xed\xa0\xbc\xed\xbf\xbe\xed\xa0\xbd\xed\xb8\x82\xed\xa0\xbd\xed\xb8\x82\xed\xa0\xbe\xed\xb4\xa3 abi oh! https://t.co/5Fy7x6KFOO"                                                                                                                                                                                                                                                  
## [2] "Grab a copy of today's @guardianlifeng, My Birthday Special. \n\nCreative Direction:… https://t.co/zvCFjSUXPJ"                                                                                                                                                                                                                                                
## [3] "Welcome to my world\xed\xa0\xbd\xed\xb1\x8d\xed\xa0\xbc\xed\xbf\xbe\xed\xa0\xbd\xed\xb4\xa5 https://t.co/RYnNm9oZeA"                                                                                                                                                                                                                                                                         
## [4] "Thanks so much guys for all the wishes,Love and support!\xed\xa0\xbc\xed\xbe\x8a\xed\xa0\xbc\xed\xbe\x82\xed\xa0\xbc\xed\xbe\x8a\xed\xa0\xbc\xed\xbe\x82\xed\xa0\xbc\xed\xbd\xbe\xed\xa0\xbc\xed\xbd\xbe\xed\xa0\xbc\xed\xbd\xbe\xed\xa0\xbc\xed\xbe\x8a\xed\xa0\xbc\xed\xbe\x82\xed\xa0\xbc\xed\xbd\xbe\xed\xa0\xbc\xed\xbe\x89\xed\xa0\xbc\xed\xbe\x89\xed\xa0\xbc\xed\xbe\x89\xed\xa0\xbc\xed\xbd\xbe\xed\xa0\xbc\xed\xbe\x8a\xed\xa0\xbc\xed\xbe\x82 #Lastnight #Birthday… https://t.co/BAhBGVxIKJ"
## [5] "\xed\xa0\xbc\xed\xbe\x82\xed\xa0\xbc\xed\xbd\xbe\xed\xa0\xbd\xed\xb4\xa5\xed\xa0\xbc\xed\xbe\x89\xed\xa0\xbc\xed\xbe\x8a\xed\xa0\xbc\xed\xbe\x81\xed\xa0\xbc\xed\xbe\x82\xed\xa0\xbc\xed\xbe\x8a\xed\xa0\xbc\xed\xbe\x81\xed\xa0\xbc\xed\xbd\xbe\xed\xa0\xbc\xed\xbd\xbe\xed\xa0\xbc\xed\xbd\xbe\xed\xa0\xbc\xed\xbe\x81\xed\xa0\xbc\xed\xbe\x8a\xed\xa0\xbc\xed\xbe\x82 #Birthday Music in the background is Out! Audio and Video Link in my… https://t.co/dlnw5Ir6SX"                         
## [6] "BRAND NEW! For my Head Audio/Video Out!!! Link on my bio \xed\xa0\xbd\xed\xb4\xa5\xed\xa0\xbd\xed\xb4\xa5\xed\xa0\xbd\xed\xb4\xa5\xed\xa0\xbd\xed\xb4\xa5\xed\xa0\xbd\xed\xb4\xa5\xed\xa0\xbd\xed\xb4\xa5\xed\xa0\xbd\xed\xb4\xa5\xed\xa0\xbd\xed\xb4\xa5\xed\xa0\xbd\xed\xb4\xa5\xed\xa0\xbd\xed\xb4\xa5\xed\xa0\xbd\xed\xb4\xa5\xed\xa0\xbd\xed\xb4\xa5\xed\xa0\xbd\xed\xb4\xa5\xed\xa0\xbd\xed\xb4\xa5\xed\xa0\xbd\xed\xb4\xa5 #ForMyHead\xed\xa0\xbd\xed\xb9\x86\xed\xa0\xbc\xed\xbf\xbf‍♂️… https://t.co/yC5MZaeipb"

Favorites


With favorites(), we get the last n tweets that were liked by a specific Twitter account.

For example, get the last 200 tweets that were ‘liked’ by Katy Perry.

katy_fav <- favorites("katyperry", n=200)

# Arrange the favorites in a dataframe.

katy_df <- twListToDF(katy_fav)

Who does Katy like?

First we count how often each account appears among the liked (favorited) tweets and sort the accounts by frequency.

# Increase margin size so that the whole label can be displayed.
par(mar=c(4,9,4,2))
name_freq <- sort( table( katy_df$screenName ), decreasing = TRUE )
barplot( name_freq[1:6], horiz=TRUE, las=2 , main="Katy's favourites.")
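The counting step can be tried without a Twitter connection. The sketch below uses a made-up vector of screen names in place of katy_df$screenName, so the names and counts are illustrative only. It also uses head() rather than name_freq[1:6], an assumption on my part that avoids NA entries when fewer accounts are present than requested.

```r
# Made-up screen names standing in for katy_df$screenName.
screen_names <- c("nasa", "bbc", "nasa", "cnn", "nasa", "bbc")

# Count tweets per account, most frequent first.
name_freq <- sort(table(screen_names), decreasing = TRUE)

# Keep at most the two most frequent accounts.
top_names <- head(name_freq, 2)
print(top_names)
```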

When is Donald most active?

donald <- userTimeline("realDonaldTrump", n=300)
don_df <- twListToDF(donald)

# Add a days of the week variable to the dataframe

don_df$weekday <- weekdays( don_df$created )

# Tabulate and plot the week days.
# Increase margin size so that the whole label can be displayed.
par(mar=c(4,9,4,2))
order_days <- rev(c('Monday','Tuesday','Wednesday','Thursday','Friday','Saturday',
            'Sunday'))
barplot( table( don_df$weekday )[order_days], horiz=TRUE, 
        las=2, main="Donald's tweeting habits." )
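Two details of the weekday tabulation are worth checking on small data. weekdays() returns names in the current locale, and indexing a table by a day that never occurs yields NA. The sketch below is one way around both, assuming made-up timestamps in place of don_df$created: it derives the day from the ISO weekday number and stores it as a factor with all seven levels, so missing days are counted as zero.

```r
# Made-up timestamps standing in for don_df$created.
created <- as.POSIXct(c("2017-11-20 08:00:00", "2017-11-20 21:30:00",
                        "2017-11-22 07:15:00", "2017-11-25 12:00:00"),
                      tz = "UTC")

day_names <- c("Monday", "Tuesday", "Wednesday", "Thursday",
               "Friday", "Saturday", "Sunday")

# "%u" gives the ISO weekday number (1 = Monday), independent of locale.
day_index <- as.integer(format(created, "%u"))

# A factor with fixed levels keeps days that have no tweets (count 0).
weekday <- factor(day_names[day_index], levels = day_names)
day_counts <- table(weekday)
print(day_counts)

# To plot with Monday at the top, as in the tutorial:
# barplot(rev(day_counts), horiz = TRUE, las = 2)
```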

Quanteda


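Tweets with quanteda

For each account we build a document-feature matrix (dfm), removing punctuation, the Twitter symbols # and @, English stop words, and a few extra words of our own. The remove_twitter argument belongs to the quanteda version this tutorial was written with; later releases changed this interface.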

library( 'quanteda' )
mystops <- c('co','https','http','t','will','&amp','amp','thank','t.co')
d_dfm<-dfm(don_df$text,
            remove_punct = TRUE,
            remove_twitter = TRUE,
            remove = c(stopwords('english'), mystops))
m_dfm<-dfm(modi_df$text,
            remove_punct = TRUE,
            remove_twitter = TRUE,
            remove = c(stopwords('english'), mystops, 'modi','rt'))
p_dfm<-dfm(peter_df$text,
            remove_punct = TRUE,
            remove_twitter = TRUE,
            remove = c(stopwords('english'), mystops))
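To see roughly what this cleaning does to a tweet, here is a base R sketch that runs without quanteda. It is an approximation written for illustration, not quanteda's implementation: the function clean_tweet(), the sample tweets, and the short stop word list are all made up for this example.

```r
# Made-up tweets and a short, hypothetical stop word list.
tweets <- c("RT @someone: Great day for #rstats https://t.co/abc123",
            "Thank you @friend! #rstats is fun")
stops <- c("rt", "the", "for", "is", "you", "thank")

# Hypothetical helper: lower-case, drop URLs, strip @ and # (keeping the
# word they prefix), replace punctuation, split on spaces, drop stop words.
clean_tweet <- function(text, stops) {
  text <- tolower(text)
  text <- gsub("https?://[^[:space:]]+", " ", text)
  text <- gsub("[@#]", "", text)
  text <- gsub("[[:punct:]]", " ", text)
  words <- strsplit(trimws(text), "[[:space:]]+")[[1]]
  words[nzchar(words) & !(words %in% stops)]
}

cleaned <- lapply(tweets, clean_tweet, stops = stops)

# Word counts across all tweets, most frequent first.
word_freq <- sort(table(unlist(cleaned)), decreasing = TRUE)
print(word_freq)
```

A dfm holds the same kind of counts, but with one row per tweet and one column per word.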

Visualize


Draw a word cloud from each document-feature matrix.

textplot_wordcloud(d_dfm,random.order=FALSE,min.freq=3,max.words=300,   
    colors=c('blue','dodgerblue','gray'), main="Trump")

textplot_wordcloud(m_dfm,random.order=FALSE,min.freq=3,max.words=300,   
    colors=c('blue','dodgerblue','gray'), main="Modi")