Text and Program Analysis (TAPAS)

This is the home page of the TAPAS consortium, which concentrates on developing methods and tools for the analysis of free text and program text.

TAPAS is a collaborative project between the Laboratory of Software Technology at the Helsinki University of Technology (HUT) and the Department of Computer Science at the University of Joensuu (UJ). The aim is to develop novel assessment methods by combining the strengths of the educational technology research groups from the two universities. The main goals are:

  • to develop advanced methods and tools for general purpose text evaluation,
  • to develop methods and tools to automatically recognize algorithms, and
  • to evaluate the new methods with large groups of students.

The first part, development of free text analysis methods, will be carried out at UJ. That includes methods for content and structural analysis of texts, semi-automatic features such as automated feedback and summary generation, and versatile measures to evaluate the quality of text, such as structure, coherence and cohesion.
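To make the idea of a cohesion measure concrete, the following is a minimal sketch, not the method developed in the project. It assumes a deliberately simple proxy: the average word overlap (Jaccard similarity) between adjacent sentences. The function names `sentences`, `words` and `adjacent_overlap` are hypothetical and chosen only for illustration.

```python
import re


def sentences(text):
    """Split text into sentences at '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def words(sentence):
    """Return the set of lowercase words in a sentence (basic Finnish letters included)."""
    return set(re.findall(r"[a-zäöå]+", sentence.lower()))


def adjacent_overlap(text):
    """Average Jaccard similarity between each pair of adjacent sentences.

    This is an assumed, simplistic stand-in for a cohesion measure:
    0.0 means adjacent sentences share no words, 1.0 means they use
    exactly the same words.
    """
    word_sets = [words(s) for s in sentences(text)]
    scores = []
    for a, b in zip(word_sets, word_sets[1:]):
        union = a | b
        scores.append(len(a & b) / len(union) if union else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    print(adjacent_overlap("The cat sat. The cat ran."))
```

A realistic measure for Finnish essays would at least need lemmatization, since inflected forms of the same word do not match in this sketch.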

The second part, program analysis, will be carried out at HUT and concentrates on algorithm recognition, a subfield of Program Comprehension. In algorithm recognition, the problem is to identify a given arbitrary algorithm implemented in a programming language and recognize it as belonging to a particular group of algorithms. Among other applications, the method can be used in the automatic assessment of student submissions. The method will allow us to check which algorithm is used in a submission, a feature that is not provided by current automatic assessment tools. The method to be developed in the research project is based on static analysis of program code, including various statistics of language constructs and analysis of the Roles of Variables in the target program. The method includes data mining approaches, such as clustering, as well as building an ontology of the supported algorithms and applying schema matching.
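As an illustration of what "statistics of language constructs" could look like, here is a minimal sketch that uses Python's standard `ast` module to extract a few counts from a program without running it. It is not the project's tool: the choice of Python as the target language, the selected features, and the names `construct_counts`, `max_loop_depth` and `BUBBLE_SORT` are all assumptions made for this example. Roles of Variables, clustering and schema matching are not covered.

```python
import ast
from collections import Counter


def construct_counts(source):
    """Count a few language constructs in Python source using static analysis."""
    counts = Counter()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.For, ast.While)):
            counts["loops"] += 1
        elif isinstance(node, ast.If):
            counts["branches"] += 1
        elif isinstance(node, ast.Assign):
            counts["assignments"] += 1
        elif isinstance(node, ast.FunctionDef):
            counts["functions"] += 1
    return counts


def max_loop_depth(source):
    """Return the deepest nesting level of for/while loops in the source."""
    def depth(node, current=0):
        if isinstance(node, (ast.For, ast.While)):
            current += 1
        children = [depth(child, current) for child in ast.iter_child_nodes(node)]
        return max([current] + children)

    return depth(ast.parse(source))


# Hypothetical student submission used as input for the sketch.
BUBBLE_SORT = '''
def bubble_sort(items):
    n = len(items)
    for i in range(n):
        for j in range(n - i - 1):
            if items[j] > items[j + 1]:
                items[j], items[j + 1] = items[j + 1], items[j]
    return items
'''

if __name__ == "__main__":
    print(dict(construct_counts(BUBBLE_SORT)))
    print(max_loop_depth(BUBBLE_SORT))
```

Feature vectors of this kind (two nested loops, one comparison, one swap) could then serve as input to a clustering or classification step that groups submissions by the algorithm they implement.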

The scientific output of the project will include improved methodology for algorithm and free text analysis, including algorithm recognition, detection of the coverage of relevant topics in essays, automated summary generation and structural analysis of texts. The outcomes of the evaluation studies will indicate how formative automatic feedback for students, as well as assessment aids for teachers, could be applied in practice. All scientific results will be reported in papers at international workshops and conferences and in journals.

The practical results of the project will include general purpose tools for recognizing algorithms as well as for automatic analysis of and feedback on free text. They will be tested extensively on large-scale courses and will therefore be readily available to all participants of the research project and to external partners.


The partners of the consortium are the COMPSER research group at HUT and the EdTech research group at UJ.

The Laboratory of Software Technology at HUT has extensively applied automatic assessment tools and methods in programming education. The goal has been to set up more exercises with detailed feedback for students, and thus to support learning on large-scale courses with hundreds of students. The Department of Computer Science at UJ has been developing an automatic assessment system to evaluate essays written in Finnish and to give automatic feedback on them.

This research project is funded by the Academy of Finland (2006-2009).