Text Mining with R Notes

Jan 25, 2024 2 min read R

Textbook

Text Mining with R

Tutorial

juliasilge.github.io/tidytext/
Sentiment Analysis: Introduction to the Syuzhet Package

Definitions

A token is a meaningful unit of text, most often a word, that we are interested in using for further analysis, and tokenization is the process of splitting text into tokens.
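
The definition above can be illustrated with a small sketch. It assumes the tidytext and dplyr packages are installed. The two-row data frame `docs` and the column names `doc_id`, `word`, and `bigram` are made up for illustration. The same text is tokenized twice, once into words (the most common token) and once into bigrams, to show that a token does not have to be a single word.

```r
library(dplyr)
library(tidytext)

# Hypothetical corpus: one document per row
docs <- data.frame(
  doc_id = 1:2,
  text = c("The phone has scratches.", "The phone has no scratches."),
  stringsAsFactors = FALSE
)

# Word tokens: one lowercase word per row, punctuation stripped
words <- docs %>%
  unnest_tokens(output = word, input = text)

# Bigram tokens: one pair of consecutive words per row
bigrams <- docs %>%
  unnest_tokens(output = bigram, input = text, token = "ngrams", n = 2)

words
bigrams
```

Which unit to choose depends on the analysis: word tokens suit frequency counts, while bigrams keep some local context such as "no scratches".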

Key functions

unnest_tokens(): tokenizes a text column and returns a one-token-per-row data frame (one word per row by default).
anti_join(get_stopwords()): removes stop words. get_stopwords() returns the stop words in tidy form, so an anti_join drops every token that matches one.
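
A minimal sketch of the two functions used together, assuming the tidytext, dplyr, and stopwords packages are installed (get_stopwords() relies on the last one). The data frame `docs` and the name `tidy_words` are hypothetical and exist only for this example.

```r
library(dplyr)
library(tidytext)

# Hypothetical corpus: one document per row
docs <- data.frame(
  doc_id = 1:2,
  text = c("The phone has scratches.", "The phone has no scratches."),
  stringsAsFactors = FALSE
)

tidy_words <- docs %>%
  # One lowercase word per row
  unnest_tokens(word, text) %>%
  # Keep only the rows whose word is NOT in the stop-word list
  anti_join(get_stopwords(), by = "word")

tidy_words
```

Stop-word lists usually contain short function words, so it is worth inspecting get_stopwords() before a sentiment analysis: dropping negation words can change what a sentence means.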

R code

# Loading necessary libraries
library(sentimentr)
library(dplyr)

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

library(magrittr)

# Example text
mytext <- c("The phone has scratches.", "The phone has no scratches.")

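# Split the text into sentences; the sentimentr:: prefix keeps this call
# unambiguous once syuzhet (loaded below) masks get_sentences()
mytext <- sentimentr::get_sentences(mytext)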

# Performing sentiment analysis
sentiment(mytext)

##    element_id sentence_id word_count  sentiment
## 1:          1           1          4 -0.3000000
## 2:          2           1          5  0.2683282

library(syuzhet)

## 
## Attaching package: 'syuzhet'

## The following object is masked from 'package:sentimentr':
## 
##     get_sentences

# Example sentences
sentences <- c("The phone has scratches.", "The phone has no scratches.")

# Get sentiment scores
sentiment_scores <- get_nrc_sentiment(sentences)

# View scores
sentiment_scores

##   anger anticipation disgust fear joy sadness surprise trust negative positive
## 1     0            0       0    0   0       0        0     0        0        0
## 2     0            0       0    0   0       0        0     0        0        0

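# This does not work well here: every NRC count is 0 for both sentences,
# so no word matched the lexicon and the negation in the second sentence
# makes no difference to the result.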
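
One likely reason for the difference between the two packages is that a plain lexicon lookup scores each word on its own, whereas sentimentr also accounts for valence shifters such as negators. The base R sketch below illustrates the lookup idea only. It is not syuzhet's implementation, and `toy_lexicon` and `score_sentence` are hypothetical names with made-up polarity values (unlike the NRC result above, the toy lexicon does contain "scratches").

```r
# Hypothetical toy lexicon: word -> polarity (NOT the NRC lexicon)
toy_lexicon <- c(scratches = -1, broken = -1, great = 1)

score_sentence <- function(sentence, lexicon) {
  # Lowercase, strip punctuation, split on whitespace
  words <- strsplit(gsub("[[:punct:]]", "", tolower(sentence)), "\\s+")[[1]]
  # Sum the polarity of every word found in the lexicon;
  # words missing from the lexicon yield NA and are ignored
  sum(lexicon[words], na.rm = TRUE)
}

sentences <- c("The phone has scratches.", "The phone has no scratches.")
scores <- vapply(sentences, score_sentence, numeric(1),
                 lexicon = toy_lexicon, USE.NAMES = FALSE)
scores
```

Both sentences get the same score because "no" is treated as just another unknown word, which is the kind of case where a negation-aware scorer is the better fit.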


Chen Xing

Founder & Data Scientist

Enjoy Life & Enjoy Work!
