Performances in Egypt
For my project, I decided to explore the link between theatres in Egypt and the types of performances they showcased. As a dancer, I have always been interested in the cultural and artistic elements of different places and periods, so I thought this would be a great opportunity to learn more. At first, I wanted to explore which theatres were most prevalent in the Gazette, but as I looked through previous projects, I realized that this had already been done. I therefore decided to focus on fixing the issues I saw in those projects and to use the corrections to say more with the data available.
The main issue that I saw in the works preceding mine was that the XPaths followed and the queries performed were focused on the Calendar of Coming Events. That is a good approach, since this is where we would find the most information about events happening at the theatres, but we also have to take into consideration flaws in our database. Looking exclusively at data tagged with the Calendar of Coming Events feature (@feature="comingEvents") would be a good search only if we knew that all instances of the Calendar of Coming Events were tagged with that feature and encoded in the same format.
To verify this, I searched for the Calendar of Coming Events by name and by feature and compared the results with the number of issues encoded. To calculate the number of issues encoded, I used the XPath
//div[@type="page"][@n="1"]. Since these tags are included in the template with which we create and encode our files, every file has page 1 encoded and tagged this way. Looking for page 1 therefore gives me the number of issues encoded and uploaded to the repository, which is 956. Using the XPath
//div[@feature="comingEvents"] I counted the number of issues that have the section tagged with the respective feature, and I found that only 609 issues have the Calendar of Coming Events tagged this way. When I searched for Calendar of Coming Events using a regular expression, I found 635 mentions of it. The feature search therefore misses at least some calendars that are present in the text, which supports my point about flaws in the encoding of the database.
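The three counts compared above could be reproduced outside an XML editor with a short script. The following is only a sketch under stated assumptions: the issues in `SAMPLE_ISSUES` are invented miniatures, the elements carry no XML namespace (the real files may use one), and the regular expression is a simple case-insensitive match on the section title. Unlike the 635 figure above, which counts mentions, this sketch counts issues that contain at least one mention.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical miniature issues, invented for illustration only.
SAMPLE_ISSUES = [
    '<text><div type="page" n="1"><div feature="comingEvents">'
    '<head>Calendar of Coming Events</head></div></div></text>',
    '<text><div type="page" n="1"><div>'
    '<p>CALENDAR OF COMING EVENTS</p></div></div></text>',
    '<text><div type="page" n="1"><div>'
    '<p>Local and General</p></div></div></text>',
]

TITLE_RE = re.compile(r"calendar of coming events", re.IGNORECASE)


def audit(issues):
    """Return (issues encoded, issues tagged by feature, issues mentioning the title)."""
    total = tagged = mentioned = 0
    for xml_text in issues:
        root = ET.fromstring(xml_text)
        # Same idea as //div[@type="page"][@n="1"]
        if root.findall(".//div[@type='page'][@n='1']"):
            total += 1
        # Same idea as //div[@feature="comingEvents"]
        if root.findall(".//div[@feature='comingEvents']"):
            tagged += 1
        # Plain-text search, independent of how the section was tagged
        if TITLE_RE.search("".join(root.itertext())):
            mentioned += 1
    return total, tagged, mentioned


print(audit(SAMPLE_ISSUES))
```

A gap between the second and third numbers points to calendars that exist in the text but were not tagged with the feature.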
Because of that issue, I decided that my queries would not be focused on the feature for the Calendar of Coming Events. To start my project, I wanted a list of theatres to focus on. To find this list, I used the regex
\w+ [A-Z]\w+ Theatre. Since not all of the results were useful, I cleaned them up in Sublime Text and counted the instances in Microsoft Excel to rank the theatres from most to least mentioned, and so decide which theatres I should focus on.
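The search, clean-up, and count steps could also be scripted. This is a rough sketch rather than the workflow I actually used: the sample sentence, the `LEADING_NOISE` word list, and the function names are all made up for illustration. Because the regex always captures one word before the capitalized name, a match such as "the Zizinia Theatre" needs its first word removed before counting.

```python
import re
from collections import Counter

# The regex from the text: any word, a capitalized word, then "Theatre".
THEATRE_RE = re.compile(r"\w+ [A-Z]\w+ Theatre")

# Hypothetical list of leading words that are not part of a theatre name.
LEADING_NOISE = {"the", "at", "in", "of", "a"}

# Invented sample text, not taken from the Gazette.
SAMPLE_TEXT = (
    "Opera at the Zizinia Theatre tonight. "
    "A ball at San Stefano Theatre. "
    "The Zizinia Theatre reopens."
)


def clean(match):
    """Drop the first word when it is an article or preposition."""
    first, rest = match.split(" ", 1)
    return rest if first.lower() in LEADING_NOISE else match


def count_theatres(text):
    """Count cleaned theatre names found by the regex."""
    return Counter(clean(m) for m in THEATRE_RE.findall(text))


for name, mentions in count_theatres(SAMPLE_TEXT).most_common():
    print(name, mentions)
```

A fixed word list will not catch every unwanted match, so a manual pass over the results is still needed.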
|San Stefano Theatre
|Abbas Helmy Theatre
Knowing that the comingEvents feature was not reliable, I found an alternative way to format my searches. Since the performances are indeed listed in the Calendar of Coming Events but the section is encoded differently from issue to issue, I used the XPath
//div[contains(., 'Calendar of Coming Events')]/p[contains(., 'Zizinia Theatre')] to find paragraphs that mention Zizinia Theatre inside divs that contain the words Calendar of Coming Events. I would have made my XPath look for the Coming Events headline specifically, rather than for the words anywhere in the div, but I noticed that not all of the issues had the title formatted as a headline. After cleaning up the results to keep only the date of the issue containing the mention and the actual performance or show, I copied the results into an Excel sheet and followed the same steps for the other theatres, substituting the theatre name in the XPath.
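To repeat this search for several theatres without retyping the query, the same logic can be expressed in a small function. This is a sketch under assumptions: Python's built-in ElementTree does not support the XPath contains() function, so the text test is done in plain Python instead; the sample XML is invented and has no namespace; and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Invented sample issue fragment, for illustration only.
SAMPLE = """
<text>
  <div type="page" n="3">
    <div>
      <p>Calendar of Coming Events</p>
      <p>Zizinia Theatre. Italian Opera.</p>
      <p>Alhambra. French Comedy.</p>
    </div>
    <div>
      <head>Local and General</head>
      <p>The Zizinia Theatre was crowded last night.</p>
    </div>
  </div>
</text>
"""


def full_text(element):
    """All text inside an element, with whitespace collapsed."""
    return " ".join("".join(element.itertext()).split())


def theatre_paragraphs(root, theatre, heading="Calendar of Coming Events"):
    """Paragraphs naming `theatre` that sit directly inside a div containing `heading`."""
    hits = []
    for div in root.iter("div"):
        if heading not in full_text(div):
            continue
        for p in div.findall("p"):  # direct children only, like /p in the XPath
            text = full_text(p)
            if theatre in text:
                hits.append(text)
    return hits


ROOT = ET.fromstring(SAMPLE)
for name in ["Zizinia Theatre", "Alhambra"]:
    print(name, theatre_paragraphs(ROOT, name))
```

The paragraph under "Local and General" is skipped because its div does not contain the calendar title, which mirrors what the XPath does.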
Once I had all the results for the different theatres, I created two tables. In both I included the date of mention, the theatre name, and the type of presentation or form of entertainment. The difference between the two is how detailed I was when categorizing the types of entertainment. For the first one, I tried to be as specific as I could, so that the results could be compared between two very similar categories, such as French Opera and Italian Opera. For the second table, I grouped similar categories into one, so that the graph would be easier to read, since there are fewer categories.
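The second table can be derived from the first instead of being categorized twice by hand. A minimal sketch, assuming rows of (date, theatre, detailed category): the three opera subtypes are the ones I merged, while the row values and the names `OPERA_SUBTYPES`, `broaden`, and `broad_table` are illustrative only, and the dates are reduced to years.

```python
# Detailed categories that are merged into the broader "Opera" category.
OPERA_SUBTYPES = {"Italian Opera", "French Opera", "Juvenile Opera"}

# Illustrative rows in the shape of the detailed table (dates reduced to years).
DETAILED_ROWS = [
    ("1905", "Abbas Helmy Theatre", "Italian Opera"),
    ("1907", "Zizinia Theatre", "Italian Opera"),
    ("1906", "San Stefano Theatre", "Concert"),
]


def broaden(category):
    """Map a detailed category to its broad category; others pass through unchanged."""
    return "Opera" if category in OPERA_SUBTYPES else category


def broad_table(rows):
    """Build the grouped table from the detailed one."""
    return [(date, theatre, broaden(category)) for date, theatre, category in rows]


for row in broad_table(DETAILED_ROWS):
    print(row)
```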
For Figure 1, I used the individual dates (day/month/year) as columns and the types of event as rows, and I represented the different theatres with different colors. Because it is shorter in height, this representation does a good job of highlighting which events were happening around the same time, and in which theatres. For example, January 1907 shows an opera at Zizinia Theatre, a performance at Alhambra, and a play at the Esbekieh Gardens Theatre in the same week. The same is true of July 1906, where we see a ball and a concert at San Stefano and a comedy at Alhambra. Because of the inconsistencies in coding, the dates shown are the dates on which the events were mentioned in the Calendar of Coming Events, not the days on which the events took place. However, I do not consider this much of a problem, since the Calendar of Coming Events only lists events a week in advance, so the two dates are close. Since the events are listed daily in the Calendar of Coming Events, we do see the same event repeated several times within a short period, but because this happens across all theatres and all events, it is also not a problem. If anything, it could show how important an event is, since more mentions in the Calendar of Coming Events would mean more "advertising" for the event. Since the conditions of the search are the same for all events and all theatres, I would say that if a certain event appears more often than others, it was likely a more popular event or one of more importance to the readers. For Figure 1, I also reduced the number of event types shown by merging similar events into a single category. The Italian Opera, Juvenile Opera, and French Opera, for example, are all shown in the Opera category.
For Figure 2, I used the type of entertainment as columns and exact dates as rows, and I again identified the different theatres with different colors. This graph clearly shows that mentions of events in theatres increased over time. This could be due to the number of issues encoded from each year and the fact that we have not encoded the entire newspaper. It could also be due to the different ways each student encodes their issues, which changes the results that the XPath search yields. Another thing we can see in this representation is which events were showcased in more than one theatre. French Comedy, for example, was shown in Esbekieh Gardens, Zizinia Theatre, and Alhambra. Similarly, the Italian Opera was present in several theatres, namely Verdi, Abbas Helmy, Alhambra, and Zizinia Theatre. From this figure we can also see a clear shift in where the events took place. While Italian Opera was predominantly shown in Abbas Helmy in 1905, it was mainly shown in Zizinia in 1907. Other events, like concerts and ballet, remained exclusively in one theatre. Ballet performances were shown in Alhambra only, and concerts happened in San Stefano only.
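The question of which events appear in more than one theatre can also be checked directly against the table, without reading it off the graph. The sketch below uses only the theatre and event pairs named in the paragraph above; the function names are hypothetical and the real table also carries a date column, which is left out here.

```python
from collections import defaultdict

# (theatre, event type) pairs as named in the text; dates are omitted.
ROWS = [
    ("Esbekieh Gardens", "French Comedy"),
    ("Zizinia Theatre", "French Comedy"),
    ("Alhambra", "French Comedy"),
    ("Verdi", "Italian Opera"),
    ("Abbas Helmy", "Italian Opera"),
    ("Alhambra", "Italian Opera"),
    ("Zizinia Theatre", "Italian Opera"),
    ("Alhambra", "Ballet"),
    ("San Stefano", "Concert"),
]


def theatres_by_event(rows):
    """Collect the set of theatres in which each event type was mentioned."""
    venues = defaultdict(set)
    for theatre, event in rows:
        venues[event].add(theatre)
    return venues


def shared_events(rows):
    """Event types mentioned in more than one theatre."""
    return {e: sorted(t) for e, t in theatres_by_event(rows).items() if len(t) > 1}


def exclusive_events(rows):
    """Event types mentioned in exactly one theatre."""
    return {e: next(iter(t)) for e, t in theatres_by_event(rows).items() if len(t) == 1}


print(shared_events(ROWS))
print(exclusive_events(ROWS))
```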