TEXT PROCESSING - Course Description

Course Description

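This course teaches the knowledge and skills needed to carry out text processing tasks in modern data-dependent environments. Unlike the basic queries that a database provides, text processing applies third-party tools to data and text, and it is an essential early step in the process that runs from text extraction, filtering, and transformation through to the development of data-dependent applications. In this course, students learn text processing tools available in the Linux environment, such as grep, sed, and awk. They also learn notations for describing useful sets of strings, such as regular expressions. As a result, by processing text used in bibliographic structures, students gain hands-on experience with a variety of methods for processing bibliographic data and text, and with their potential applications.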

Course Objectives

This course aims to make students proficient in text processing tasks. After completing this course, students should be proficient in using the Linux commands for text processing and should understand concepts such as regular expressions.

Teaching Method

Attendance is critical, as class sessions are used to introduce some of the more difficult concepts. This course will be taught in English.

Textbook

Assessment

Requirements

There is no prerequisite course.

Practical application of the course

The text processing tools introduced in this course can be applied to many practical situations. Although the focus of this course is on text processing, the programming aspects it covers are a foundational skill, and students are expected to apply the concepts learned here to future programming scenarios. The course is also expected to serve as a stepping stone to learning other programming languages, since it covers many fundamental programming concepts. By the end of this course, students should be able to write a script (program) that extracts certain parts of a text based on a simple, recognizable pattern found in the text.
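As an illustration of the kind of script described above, the following is a minimal sketch, not course material. It assumes Python rather than the grep, sed, and awk tools the course names, and the sample text, the pattern, and the function name `extract_years` are all hypothetical. It pulls out every four-digit year written in parentheses, a simple recognizable pattern in bibliographic-style text.

```python
import re

# Hypothetical sample text; the entries are placeholders, not real references.
SAMPLE = """\
Author A. (1999). First placeholder title.
Author B. (2014). Second placeholder title.
A line with no year in parentheses.
"""

# Pattern: exactly four digits enclosed in parentheses, e.g. "(1999)".
# The inner group captures only the digits, without the parentheses.
YEAR_PATTERN = re.compile(r"\((\d{4})\)")


def extract_years(text):
    """Return every four-digit year that appears in parentheses, in order."""
    return YEAR_PATTERN.findall(text)


if __name__ == "__main__":
    for year in extract_years(SAMPLE):
        print(year)
```

The same idea carries over to the command-line tools covered in the course: the regular expression describes the set of strings to match, and the tool decides what to do with each match.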

Reference