JEB133 Data analysis
The lecture and seminars on Data Analysis within the Economy History course (No. JEB133) will be organized as follows: the lecture on 03/15 will be replaced by an online self-guided tutorial demonstrating a parliamentary data analysis. Please study this tutorial in detail. The seminars on 03/15 and 03/22 will be replaced by an online Zoom session on 03/22 at 2:30 p.m. to address students' questions related to the assignment (see below). If no one joins the session by 2:45 p.m., I will close it. Zoom link to join the session: https://cesnet.zoom.us/j/2592348083?omn=93750870963
The students must complete a project assignment. The motivation behind the assignment is to encourage students to use data in their projects. The assignment is a practical, hands-on exercise in which the students work with the collections of parliamentary data compiled in the ParlaMint project. Please check the list of ParlaMint collections available in KonText: https://www.clarin.si/kontext/corpora/corplist > ParlaMint.
Project assignment
Answer the following questions:
- How many times have the speakers in a parliament of your choice talked about a topic of your choice?
  E.g. I am interested in the debates in the Czech parliament on the topic of vaccination (očkování in Czech). Therefore I choose the collection ParlaMint-CZ 4.0 (Czech parliament) and run the query [lemma="očkování"].
- How did the overall frequency of the topic change over the months?
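A topic that takes more than one word can be queried the same way as the single-lemma example above, by writing one [lemma="..."] token per word in sequence. The following is a minimal sketch in Python of how such a query string is assembled; the function name lemma_query is hypothetical, and it assumes the words are already given as lemmas and contain no double quotes.

```python
def lemma_query(*lemmas: str) -> str:
    """Build a CQL query that matches the given lemmas as consecutive tokens.

    Assumption: each argument is already a lemma (dictionary form)
    and contains no double-quote characters.
    """
    return "".join(f'[lemma="{lemma}"]' for lemma in lemmas)


# Single-word topic, as in the vaccination example.
single = lemma_query("očkování")
print(single)

# Multi-word topic: one token per word, in order.
phrase = lemma_query("leave", "the", "European", "Union")
print(phrase)
```

The printed string can be pasted into the KonText query field as an advanced (CQL) query.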
Proceed analogously to the solution demonstrated in the tutorial:
- Choose the collection in KonText and work with its version 4.0 (or 2.1).
- Create a query to answer your questions (1) and (2).
- Extract the data from KonText and upload them to a Google spreadsheet.
- Create a pivot table to answer question (1) and a plot to answer question (2); see pp. 15-18 in the tutorial.
- Create one more sheet in your spreadsheet and describe your results. Your description should not be longer than 500 characters (including spaces).
- Share your spreadsheet with vidohlad@gmail.com by April 2, 2024. I will then score your results: the maximum score is 6 points and the minimum score is 1 point.
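For those who want to cross-check their spreadsheet, the pivot table and the monthly plot in the steps above come down to two counts: hits per speaker and hits per month. The sketch below is not the tutorial's method, only a rough Python equivalent under stated assumptions: the column names "name" and "date" and the yyyy-mm-dd date format are guesses about the exported data, and the sample rows are invented purely for illustration.

```python
from collections import Counter

# Invented sample rows for illustration only; real rows come from the
# KonText export. Column names "name" and "date" are assumptions.
rows = [
    {"name": "Speaker A", "date": "2021-01-12"},
    {"name": "Speaker B", "date": "2021-01-20"},
    {"name": "Speaker A", "date": "2021-02-03"},
    {"name": "Speaker A", "date": "2021-02-17"},
]


def hits_per_speaker(rows):
    """Count hits per speaker, most frequent first (like the sorted pivot table)."""
    return Counter(row["name"] for row in rows).most_common()


def hits_per_month(rows):
    """Count hits per yyyy-mm in chronological order (the data behind the plot)."""
    counts = Counter(row["date"][:7] for row in rows)  # keep only "yyyy-mm"
    return sorted(counts.items())


print(hits_per_speaker(rows))
print(hits_per_month(rows))
```

If the numbers from such a check differ from the pivot table, the usual suspects are a header row counted as data or dates stored in a different format.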
Feedback
March 26, 2024
I have looked at several of the submitted Google spreadsheets (thank you!) and have a few comments:
- Don't forget to sort the pivot table with the speakers by the COUNTA of the name attribute in descending order (see p. 16 in the tutorial).
- Check that the plot has the correct axis labels: x-axis yyyy-mm and y-axis COUNTA-of-yyyy-mm (see p. 18 in the tutorial).
- Edit the chart title as follows: list the corpus you used, e.g. GB-2.1, and the query you submitted, e.g. [lemma = "leave"][lemma = "the"][lemma = "European"][lemma = "Union"] (see p. 18 in the tutorial).
- Please save your charts in PNG format (see p. 19 in the tutorial) and upload them to a shared Google Photo Album.
Micro Gallery