Quantitative Text Analysis and Evaluating Lexical Style in R

Offered by:
Coursera Project Network
In this guided project, you will:

Tokenize text documents to examine the top words by frequency

Examine how the type-to-token ratio, a measure of text complexity, changes over time

1 hour
Beginner
No download needed
Split-screen video
English
Desktop only

By the end of this project, you will understand the concept of lexical style in textual analysis and how to evaluate it in R. You will know how to load and pre-process a data set of text documents by converting it into a corpus and a document-feature matrix. You will know how to calculate the type-to-token ratio (TTR), the number of distinct words (types) divided by the total number of words (tokens), which serves as a measure of a text's lexical complexity. You will also know how to isolate terms of particular lexical interest in a text and visualize how the frequency of those terms varies across texts over time.
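The type-to-token ratio described above can be sketched in a few lines of base R. This is a minimal illustration, not the project's own code: the function names `tokenize_simple` and `type_token_ratio`, the regular-expression tokenizer, and the example sentence are all assumptions made for this sketch. A dedicated text-analysis package would usually handle tokenization more carefully.

```r
# Type-to-token ratio (TTR) = distinct words (types) / total words (tokens).
# Assumption: a naive tokenizer that lowercases the text and splits on
# anything that is not a letter or an apostrophe.
tokenize_simple <- function(text) {
  words <- unlist(strsplit(tolower(text), "[^a-z']+"))
  words[nzchar(words)]  # drop empty strings left by leading delimiters
}

type_token_ratio <- function(text) {
  tokens <- tokenize_simple(text)
  if (length(tokens) == 0) return(NA_real_)  # undefined for empty text
  length(unique(tokens)) / length(tokens)
}

# Hypothetical example: 10 tokens, 7 distinct types.
example_text <- "The cat sat on the mat. The dog sat too."
ttr_example <- type_token_ratio(example_text)
print(ttr_example)
```

A higher ratio means fewer repeated words. Because TTR tends to fall as texts get longer, comparisons are most meaningful between documents of similar length.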

Skills you will develop

  • Descriptive Analysis
  • Text Analysis
  • Data Wrangling
  • Data Visualization (DataViz)
  • Text Corpus

Learn step by step

In a video that plays in a split screen with your work area, your instructor will walk you through these steps:

  1. Understand the concept of lexical style in textual analysis, load textual data into R, and turn it into a corpus object

  2. Extract metadata from text document filenames and calculate the type-to-token ratio (TTR)

  3. Examine how the type-to-token ratio, a measure of text complexity, changes over time

  4. Tokenize text documents to examine top words by frequency of appearance and isolate words of particular lexical interest in the text

  5. Visualize how the frequency of features of particular lexical interest varies across your texts over time
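The steps above can be sketched end to end in base R. Everything specific here is an assumption made for illustration: the three tiny documents, the "<year>-<speaker>.txt" filename pattern, the naive tokenizer, and the choice of "freedom" as the term of interest. The guided project itself may rely on a dedicated text-analysis package and different data.

```r
# Hypothetical documents, named with an assumed "<year>-<speaker>.txt" pattern.
docs <- c(
  "1990-Smith.txt" = "We must protect freedom and defend freedom at home.",
  "2000-Jones.txt" = "The economy grows and jobs return to every town.",
  "2010-Brown.txt" = "Freedom and jobs matter, and the economy must serve all."
)

# Step 2: extract metadata (year, speaker) from the filenames.
name_parts <- strsplit(sub("\\.txt$", "", names(docs)), "-")
meta <- data.frame(
  year = as.integer(sapply(name_parts, `[`, 1)),
  speaker = sapply(name_parts, `[`, 2),
  stringsAsFactors = FALSE
)

# Step 4: tokenize with a naive lowercase, letters-only splitter (assumption).
tokenize_simple <- function(text) {
  words <- unlist(strsplit(tolower(text), "[^a-z']+"))
  words[nzchar(words)]
}
tokens_by_doc <- lapply(docs, tokenize_simple)

# Steps 2 and 3: type-to-token ratio per document, ready to compare by year.
meta$ttr <- sapply(tokens_by_doc, function(tok) length(unique(tok)) / length(tok))

# Step 4: top words by frequency across all documents.
word_counts <- sort(table(unlist(tokens_by_doc)), decreasing = TRUE)
top_words <- head(word_counts, 5)

# Step 5: share of tokens in each document equal to the term of interest.
term <- "freedom"
meta$term_share <- sapply(tokens_by_doc, function(tok) mean(tok == term))

print(meta)
print(top_words)

# Draw only in an interactive session so the script also runs headless.
if (interactive()) {
  plot(meta$year, meta$term_share, type = "b",
       xlab = "Year", ylab = paste0("Share of tokens equal to '", term, "'"))
}
```

Using the share of tokens rather than the raw count keeps documents of different lengths comparable. In this toy corpus the most frequent word is a function word ("and"), which is why analyses of this kind often remove stopwords before ranking terms.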

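How Guided Projects work

Your workspace is a cloud desktop that loads directly in your browser, so no download is required.

In a split-screen video, your instructor guides you through the project step by step.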

Frequently asked questions

Have more questions? Visit the Learner Help Center.