The first posts will be about acquiring some data sets to play around with the German language in general. I'm currently learning about German declension, i.e. how articles, nouns, and adjectives change form according to grammatical gender, case, and number (singular or plural). I want to dig a bit deeper into this topic and look at it from different perspectives using statistics, NLP, and more.
German has four grammatical cases: nominative, genitive, dative, and accusative. They make it possible to form complex sentences without losing information about the arguments. What do I mean by arguments? In the English example sentence ‘I give you the laptop’, we have the verb ‘give’ (or the predicate, to be more precise). The predicate has three arguments, namely ‘I’ (subject), ‘you’ (indirect object), and ‘the laptop’ (direct object). In this example, the word order signals who gave what to whom.

In German, the three arguments are marked using the grammatical cases. The sentence translates as ‘Ich gebe dir den Laptop’. We have the predicate ‘gebe’ with three arguments: ‘Ich’ (subject), ‘dir’ (indirect object), and ‘den Laptop’ (direct object). These are formed using the nominative, dative, and accusative, respectively. In complex sentences, the case can help you identify the subject, the direct object, or the indirect object.

The genitive case is used to mark possession, e.g. ‘Die Bäume des Gartens’. Besides these typical usages, the cases are also applied in many other situations, and it's impossible to sum up all the details in one blog post. The cases are outlined in the following table.
| Case | Typical role | Example |
|---|---|---|
| Nominative | Subject | Das Auto ist gelb |
| Genitive | Possession | Die Räder des Autos |
| Dative | Indirect object | Das Auto gehört mir |
| Accusative | Direct object | Ich fahre das Auto |
The grammatical cases are central to building sentences, but you also have to take grammatical gender into account. To complicate matters, there is not just one gender, but three: masculine (der), feminine (die), and neuter (das). Apart from a few rules of thumb, you have to learn the gender of each noun by heart. However, knowing which gender goes with which noun can help you parse and understand more complex sentences. The genders are listed in the following table.
Combined, these rules form the principles of declension. For example, we can create the following table of the definite article for all cases and genders, plus the plural. It helps in translating ‘the «noun»’ into German, whether the noun is singular or plural, and whether it is the subject, the indirect or direct object, or a possessor. Beware, however, that you might need to add a suffix to the noun as well.
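As a rough sketch, the definite-article table can also be written as a small lookup in Python. The names `DEFINITE_ARTICLES` and `definite_article` are my own, chosen for illustration. The sketch only returns the article and does not handle the noun suffixes mentioned above.

```python
# Definite articles indexed by case, then by gender (or "plural").
# The plural forms are the same for all three genders.
DEFINITE_ARTICLES = {
    "nominative": {"masculine": "der", "feminine": "die", "neuter": "das", "plural": "die"},
    "genitive":   {"masculine": "des", "feminine": "der", "neuter": "des", "plural": "der"},
    "dative":     {"masculine": "dem", "feminine": "der", "neuter": "dem", "plural": "den"},
    "accusative": {"masculine": "den", "feminine": "die", "neuter": "das", "plural": "die"},
}


def definite_article(case: str, gender: str) -> str:
    """Return the definite article for a case and a gender (or "plural").

    Hypothetical helper: it raises KeyError for unknown cases or genders
    and does not add any suffix to the noun itself.
    """
    return DEFINITE_ARTICLES[case][gender]


# Examples taken from the sentences used earlier in this post.
print(definite_article("accusative", "masculine"))  # den (den Laptop)
print(definite_article("genitive", "neuter"))       # des (des Autos)
```

Note that several cells share the same form (for example, ‘der’ appears four times), so the article alone does not always tell you the case.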
The theory is all well and good, but how does it look in practice? To research the topic of declension we need some data: a list of words, their word types, and possibly the root of each word or how it is declined.
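To make that wish list a bit more concrete, here is a minimal sketch of what one record in such a data set might look like. The class `WordEntry`, its field names, and the helper `forms_of` are hypothetical and do not follow the format of any particular data set. The three entries are taken from the example sentences above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class WordEntry:
    """One word as it might appear in a data set (hypothetical layout)."""
    form: str                     # the word as written, e.g. "Autos"
    word_type: str                # e.g. "noun", "article", "adjective"
    lemma: Optional[str] = None   # root / dictionary form, e.g. "Auto"
    case: Optional[str] = None    # how it is declined, if known
    gender: Optional[str] = None


entries = [
    WordEntry("Auto", "noun", lemma="Auto", case="nominative", gender="neuter"),
    WordEntry("Autos", "noun", lemma="Auto", case="genitive", gender="neuter"),
    WordEntry("Laptop", "noun", lemma="Laptop", case="accusative", gender="masculine"),
]


def forms_of(lemma: str, data: List[WordEntry]) -> List[str]:
    """Collect the distinct written forms recorded for one lemma."""
    return sorted({e.form for e in data if e.lemma == lemma})


print(forms_of("Auto", entries))  # ['Auto', 'Autos']
```

Grouping forms by root like this is one way to see how a word is declined once real data is available.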
The next article will look at Universal Dependencies, and what a word really is.