Article by Geyken, 2006

1. Part 5: "text selection"

1.1. prose, verse & drama

1.1.1. 26% of DWDS corpus

1.1.2. every year between 1900 & 1999

1.1.2.1. 3 longer prose works selected

1.1.2.1.1. two longer ones

1.1.2.1.2. one of light fiction

1.1.3. selection process

1.1.3.1. project team

1.1.3.1.1. provisional list based on

1.1.3.2. members of the Academy of Sciences

1.1.3.2.1. 3 specialists in German studies

1.1.3.2.2. comment on that list

1.1.3.2.3. nominate those texts

1.2. newspapers

1.2.1. 27%

1.2.2. newspaper reports & periodical articles

1.2.2.1. from 50 different national & regional newspapers/magazines

1.2.3. from 1900-1933

1.2.3.1. newspaper samples taken in regular intervals

1.2.4. from 1933-1945

1.2.4.1. "Völkischer Beobachter"

1.2.5. from 1945-2000

1.2.5.1. regular newspapers

1.2.6. in addition

1.2.6.1. samples of specific events

1.2.6.1.1. 1900: World Fair in Paris

1.2.6.1.2. 1901: Nobel Prize award ceremony

1.2.6.1.3. 1902: the end of the Boer War

1.2.6.2. idea behind that

1.2.6.2.1. certain words/expressions

1.3. science

1.3.1. 22%

1.3.2. more than 100 members of the Academy of Sciences

1.3.2.1. DWDS project

1.3.2.1.1. each decade

1.3.2.2. for works

1.3.2.2.1. most important for their disciplines

1.3.2.3. all major scientific disciplines & fields of knowledge

1.3.2.4. results of this survey

1.3.2.4.1. basis for the science corpus

1.3.3. selected each year

1.3.3.1. 1 important scientific monograph

1.3.3.2. 4 articles from scientific journals

1.3.3.2.1. in rotation between scientific disciplines

1.4. other non-fiction

1.4.1. 20%

1.4.2. subcorpus

1.4.2.1. self-help literature

1.4.2.2. texts rarely considered in lexicography

1.5. transcriptions of spoken language

1.5.1. last 2 decades

1.5.1.1. corpora

1.5.1.1.1. everyday conversations

1.5.1.1.2. radio & television interviews

1.5.1.1.3. recordings of dialectal speech

1.5.2. resources for transcription

1.5.2.1. large amounts of unscripted conversation

1.5.2.1.1. still substantial: collect transcriptions of non-spontaneous speech

1.5.3. DWDS Kerncorpus

1.5.3.1. 200 samples of radio interviews

1.5.3.1.1. before 1945

1.5.3.1.2. after 1945

1.5.4. texts from Austria & Switzerland

1.5.4.1. deliberately underrepresented

1.5.4.1.1. current corpus

1.5.4.1.2. plans for co-operation & coordination of corpus compilation

2. to provide a certain balance

3. BBAW

3.1. Berlin-Brandenburg Academy of Sciences

3.2. 2000-2003

3.3. core corpus

3.3.1. 100 million words

3.3.2. balanced

3.3.2.1. chronologically

3.3.2.2. by text genre

3.4. extended corpus

3.4.1. + 900 million text words
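
The genre shares listed in Part 5 can be turned into rough word counts. The following is only a minimal sketch: it assumes the percentages apply to the 100-million-word core corpus, and it assumes the share not covered by the four listed genres belongs to the transcriptions of spoken language. The names `GENRE_SHARES`, `CORE_CORPUS_WORDS` and `words_per_genre` are hypothetical and serve illustration only.

```python
# Genre shares as listed in the outline (percent of the corpus).
GENRE_SHARES = {
    "prose, verse & drama": 26,
    "newspapers": 27,
    "science": 22,
    "other non-fiction": 20,
}

# Core corpus size given in the outline. Applying the shares to this
# figure is an assumption made for this sketch.
CORE_CORPUS_WORDS = 100_000_000


def words_per_genre(shares, total_words):
    """Return approximate word counts per genre, plus the unlisted remainder."""
    counts = {genre: total_words * pct // 100 for genre, pct in shares.items()}
    # Assumption: the part not covered by the listed shares is attributed
    # to the remaining category (transcriptions of spoken language).
    counts["remainder (assumed: spoken language)"] = total_words - sum(counts.values())
    return counts


if __name__ == "__main__":
    for genre, n in words_per_genre(GENRE_SHARES, CORE_CORPUS_WORDS).items():
        print(f"{genre}: ~{n:,} words")
```

Under these assumptions the four listed genres cover 95% of the core corpus, which leaves about 5 million words for the remaining category.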

4. Main motivations

4.1. offer a German dictionary

4.1.1. satisfactory representation of the 20th-century lexicon

4.2. order the words by

4.2.1. lexical categories

4.2.2. types of syntactic constructions

4.2.3. lexical fields

4.3. filter out interesting words & word senses in a large mass of data

4.4. no satisfactory corpora

4.4.1. LIMA corpus

4.4.1.1. too small

4.4.2. IDS corpus

4.4.2.1. too focused on recent newspaper texts

4.4.2.2. not chronologically balanced

5. Specifications

5.1. a basis that is not too small

5.2. balanced chronologically

5.3. different kinds of influential texts

6. PESTRE Marine L3 - SDL