In this practical, we will reproduce the experiments from Al Anaissy et al. 2024.

For this practical, you will need to download their code from GitHub. Alternatively, you can download the ZIP file below (obtained on 6 August 2024).

Exploring the code

Look at the files in the data folder. Can you infer which files were generated by the authors and which come from the original datasets?
Looking at the src folder, can you infer which files were used to create the intermediate dataset files?
What is the purpose of the angrymen_custom_dataset.py and debatepedia_custom_dataset.py files in the classification folder?
Before starting, create a Python environment with all the necessary libraries.
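To help with the exploration questions above, a small script can list the header and size of each CSV file. This is only a minimal sketch: the function name `summarize_csv_files` is hypothetical and not part of the authors' repository, and it assumes the files are comma-separated and readable as UTF-8.

```python
import csv
from pathlib import Path


def summarize_csv_files(folder):
    """Return {file name: (header row, number of data rows)} for each CSV under folder."""
    summary = {}
    for path in sorted(Path(folder).rglob("*.csv")):
        # errors="replace" avoids a crash if a file is not valid UTF-8 (assumption).
        with path.open(newline="", encoding="utf-8", errors="replace") as handle:
            reader = csv.reader(handle)
            header = next(reader, [])       # first row, or [] for an empty file
            n_rows = sum(1 for _ in reader)  # remaining rows are data rows
        summary[path.name] = (header, n_rows)
    return summary
```

For example, calling `summarize_csv_files("data")` from the root of the downloaded code and printing the result gives a quick overview of the columns in each file, which can hint at whether a file is raw data or a generated intermediate.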

Generating the dataset

After the first section, you should have an initial grasp of the code. In the rest of this practical, we will focus on the angrymen dataset only (but you can repeat the steps for the debatepedia dataset if you have time).

As the TwelveAngryMan_arguments.csv file shows, the authors generated 100 initial weights for each probability distribution (Poisson, uniform, normal, and beta). The TwelveAngryMan_entailment.csv file contains the information about the attacks in the graphs. Use the Quad_semantics.py code to generate the corresponding degrees for the Quad semantics.
Then, use Quad_gold_label_trucate.py to convert the degree files obtained in the previous step into .txt files.
Similarly, create node feature files in .txt format using the code in featureExtraction.py.
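To build intuition for what a degree computation does before running the authors' script, here is a minimal sketch of DF-QuAD, one published gradual semantics from the QuAD family. It is an assumption that this matches the variant implemented in Quad_semantics.py, so treat it as an illustration rather than a replacement. The function names and the dictionary-based graph representation are hypothetical, support relations are included for completeness, and the graph is assumed to be acyclic.

```python
from math import prod


def aggregate(strengths):
    """Probabilistic sum of a collection of strengths: 1 - prod(1 - s)."""
    # With no attackers (or supporters), prod of nothing is 1, so the result is 0.
    return 1.0 - prod(1.0 - s for s in strengths)


def combine(base, attack, support):
    """DF-QuAD combination of an initial weight with aggregated attack and support."""
    if attack >= support:
        # Attack dominates (or ties): move the weight towards 0.
        return base - base * (attack - support)
    # Support dominates: move the weight towards 1.
    return base + (1.0 - base) * (support - attack)


def dfquad_degrees(weights, attackers, supporters):
    """Compute the degree of every argument in an acyclic bipolar graph.

    weights: dict mapping argument -> initial weight in [0, 1]
    attackers / supporters: dict mapping argument -> list of arguments
    """
    degrees = {}

    def degree(node):
        if node not in degrees:
            va = aggregate(degree(a) for a in attackers.get(node, []))
            vs = aggregate(degree(s) for s in supporters.get(node, []))
            degrees[node] = combine(weights[node], va, vs)
        return degrees[node]

    for node in weights:
        degree(node)
    return degrees
```

As a toy check, with weights `{"a": 0.5, "b": 0.8, "c": 0.4}`, where `b` attacks `a` and `c` supports `a`, the aggregated attack on `a` is 0.8 and the aggregated support is 0.4, so the degree of `a` drops from 0.5 to 0.5 - 0.5 * 0.4 = 0.3.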

Reproducing the experiments