Computational Morphology
Tentative course schedule
- 09 April -- introduction of the course; What is computational morphology? PDF
- 16 April -- finite state automata (FSA); finite state transducers (FST); PDF
- 23 April -- weighted FSAs and FSTs; important algorithms; PDF
- 30 April -- test on FSAs and FSTs; Test; solution for problem 1, solution for problem 2, solutions for problems 3 & 4
- 7 May -- morphological operations; slides by Kaja
- 14 May -- No class: Campusmesse!
- 21 May -- morphological operations (continuation); slides by Kaja
- 28 May -- working with a problem;
- 04 June -- morphological operations (continuation), test example problem; slides, problem: table 1, problem: table 2, description of the problem is at the end of the page.
- 11 June -- Sports day: no class due to the storm!
- 18 June -- No class! There will be no test and no class on June 18. Instead, the problem from the class on June 4 (the Arabic one with numbers) becomes your homework, which you should either send me by email or bring to class on June 25.
- 25 June -- problem solving; notes by Anna
- 2 July -- machine learning techniques for morphology; PDF
- 9 July -- third test;
- 16 July -- discussion of the test/homework, concluding remarks.
Grading
Attendance:
- I don't care about attendance itself;
- if you attend, please participate;
- if you want to do something else, please do it outside of class;
Active participation:
- comments and questions during the class;
- answering questions (an incorrect answer counts too);
- active participation earns you 20 points;
Tests/Assignments:
- FSA and FST test: 20 points;
- June 11: test or assignment on analyzing new data with the methods learned in class; 20 points;
- July 9: last test; 40 points.
Grades:
- for a BN you need 50 points;
- AP: there are two ways to receive an AP:
- you do all the tests during the semester and the last (big) test is your official AP exam (40% of your grade);
- you do a separate written exam (similar to the three tests) in late summer/early spring (date will be announced) with a maximum of 100 points.
- Grade/points correspondence:
- 1.0: 95 -- 100
- 1.3: 91 -- 94
- 1.7: 87 -- 90
- 2.0: 83 -- 86
- 2.3: 80 -- 82
- 2.7: 75 -- 79
- 3.0: 70 -- 74
- 3.3: 65 -- 69
- 3.7: 60 -- 64
- 4.0: 50 -- 59
Arabic problem:
You are given a list of Arabic words in Latin transcription
and their translations *in mixed order*:
in the third place; in the fifth place; every fifth; 1/9; 1/4; 1/6; in groups of 9; triangle; fourth; hexagon (a figure with 6 angles); sixth.
What you should do:
- find out which translation corresponds to which Arabic word;
- provide transducer(s) that take as input a description of the form (you may use short notations, for example "3a" for "third") and output the corresponding Arabic word (not only for the words in the list, but for all the form types listed here, for every number that occurs in the problem!).
Hint: Arabic words consist of a root of 3 consonants and a pattern (additional affixes + vowels inserted between the root consonants + sometimes doubling of the middle consonant).
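To make the hint concrete, here is a minimal Python sketch of how a root and a pattern combine. It is not a finite state transducer and not a solution to the problem: it is a plain function showing the kind of mapping your transducer has to compute. The function name `interdigitate`, the slot notation `C1`/`C2`/`C3`, and the example root k-t-b ("write", the usual textbook example) are assumptions made for illustration; none of them come from the problem data.

```python
def interdigitate(root, pattern):
    """Fill a template such as 'C1aC2C2aC3' with the consonants of a root.

    root    -- sequence of exactly three consonants, e.g. ("k", "t", "b")
    pattern -- template in a hypothetical notation: 'C1', 'C2', 'C3' are
               slots for the root consonants; everything else (vowels,
               affixes) is copied unchanged.
    """
    if len(root) != 3:
        raise ValueError("expected a root of exactly three consonants")
    out = []
    i = 0
    while i < len(pattern):
        # A slot is the letter 'C' followed by the digit 1, 2 or 3.
        if pattern[i] == "C" and i + 1 < len(pattern) and pattern[i + 1] in "123":
            out.append(root[int(pattern[i + 1]) - 1])
            i += 2
        else:
            out.append(pattern[i])
            i += 1
    return "".join(out)


# Illustrative root and patterns (NOT the ones from the problem).
ROOT = ("k", "t", "b")
PATTERNS = [
    "C1aaC2iC3",   # vowels inserted between the consonants: kaatib "writer"
    "maC1C2uuC3",  # prefix plus vowels: maktuub "written"
    "C1aC2C2aC3",  # middle consonant doubled: kattab
]

for p in PATTERNS:
    print(p, "->", interdigitate(ROOT, p))
```

In a transducer, the same idea shows up as separate pieces: one part supplies the root consonants for a number, another supplies the pattern for a form type such as "3a", and their combination produces the surface word.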