Before you start

Please run the script that checks the titles of the articles (they may change over time).

  1. Go to wikimole/scraper
  2. Click on Check article title

Articles network

This visualization explores the connections among the articles considered in the project.
Visualization: Every article is represented by a bubble. The bigger the bubble, the more incoming links the article has. The closer two bubbles are, the more incoming links they have in common. The colors represent the clusters of articles according to their modularity class.

Data scraping
  1. Go to wikimole/scraper/lib/scraper.py
  2. Run the function get_incoming_links() and get the file with the list of incoming links
Data parsing
  1. Go to wikimole/scraper/lib/parser.py
  2. Run the function count_co_occurrences() to get the raw file with the list of co-occurrences
  3. Create an Excel file and import the file gathered through the previous step
  4. Make a pivot table with the following columns: couple (source + target), count (number of co-occurrences)
  5. Copy the pivot table into a text editor
  6. Replace "---" with a tab and insert the header: source, target, weight. The result is the edges file
  7. Make the nodes file with the following header: id (article title), label (article title)
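The Excel and text-editor steps above can also be sketched in Python. This is a minimal sketch, not part of wikimole: it assumes the raw file from count_co_occurrences() holds one couple per line in the form source---target (inferred from step 6), and the function names build_edges and edges_to_tsv are hypothetical.

```python
import csv
import io
from collections import Counter


def build_edges(raw_lines, separator="---"):
    """Count identical source/target couples, like the pivot table does."""
    counts = Counter()
    for line in raw_lines:
        line = line.strip()
        if not line or separator not in line:
            continue  # skip blank or malformed lines
        source, target = line.split(separator, 1)
        counts[(source.strip(), target.strip())] += 1
    # One row per couple: (source, target, weight)
    return [(s, t, w) for (s, t), w in sorted(counts.items())]


def edges_to_tsv(rows):
    """Render the rows as tab-separated text with the header source, target, weight."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["source", "target", "weight"])
    writer.writerows(rows)
    return buffer.getvalue()
```

Writing the returned text to a file gives an edges file with the same header as in step 6.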
Data visualization
  1. Open a new Gephi project
  2. In the Data laboratory tab: Import the nodes file
  3. Import the edges file
  4. In the Overview tab: Run the Modularity statistic (resolution 1)
  5. Go to Appearance/Nodes/Color/Ranking and choose the attribute Modularity Class
  6. Go to Appearance/Nodes/Size/Ranking and choose the attribute In-Degree (min size 3, max size 20)
  7. Run the Force Atlas 2 layout, then apply the Prevent Overlap option
  8. Run the Noverlap layout
  9. In the Preview tab: set curved edges and node labels without proportional size
  10. Export the svg file to be further edited in a vector graphics editor
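To make the size ranking in step 6 concrete, the sketch below computes in-degrees from a list of edges and maps them onto the 3 to 20 size range. It does not use any Gephi API. The linear interpolation is an assumption about how the ranking behaves, and in_degrees and scale_sizes are hypothetical names.

```python
from collections import Counter


def in_degrees(edges):
    """Count incoming edges per node; nodes that only appear as a source get 0."""
    degree = Counter()
    for source, target in edges:
        degree[target] += 1
        degree.setdefault(source, 0)
    return dict(degree)


def scale_sizes(degrees, min_size=3.0, max_size=20.0):
    """Linearly map each in-degree onto the range [min_size, max_size]."""
    low, high = min(degrees.values()), max(degrees.values())
    if high == low:
        # All nodes have the same in-degree, so they all get the minimum size.
        return {node: min_size for node in degrees}
    span = high - low
    return {
        node: min_size + (value - low) / span * (max_size - min_size)
        for node, value in degrees.items()
    }
```

With the edges A to C, B to C and A to B, article C has the highest in-degree and gets size 20, while A has none and gets size 3.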

Network of incoming links

This visualization explores the connections among the articles considered in the project and the articles linked to them.
Visualization: Every article is represented by a bubble. The bigger the bubble, the more incoming links the article has. The closer two bubbles are, the more incoming links they have in common. The colors represent the clusters of articles according to their modularity class.

Data scraping
  1. Go to wikimole/scraper/lib/scraper.py
  2. Run the function get_incoming_links() and get the file with the list of incoming links
Data parsing
  1. Go to wikimole/scraper/lib/parser.py
  2. Run the function get_edges() to get the edges file
  3. Make the nodes file with the following header: id (article title), label (article title)
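The nodes file in step 3 can be derived from the edges themselves. The following is a rough sketch under the assumption that every node appears in at least one edge and that each edge row starts with its source and target titles. The names build_nodes and nodes_to_csv are hypothetical.

```python
import csv
import io


def build_nodes(edge_rows):
    """Collect every distinct title seen as a source or a target, sorted by title."""
    titles = set()
    for source, target, *_ in edge_rows:
        titles.update((source, target))
    # The article title serves as both id and label.
    return [(title, title) for title in sorted(titles)]


def nodes_to_csv(rows):
    """Render the nodes as CSV text with the header id, label."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["id", "label"])
    writer.writerows(rows)
    return buffer.getvalue()
```

The csv module quotes titles that contain commas, so such titles should stay in a single column when the file is imported.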
Data visualization
  1. Open a new Gephi project
  2. In the Data laboratory tab: Import the nodes file
  3. Import the edges file
  4. In the Overview tab: Run the Modularity statistic (resolution 1)
  5. Go to Appearance/Nodes/Color/Ranking and choose the attribute Modularity Class
  6. Go to Appearance/Nodes/Size/Ranking and choose the attribute In-Degree (min size 3, max size 20)
  7. Run the Force Atlas 2 layout, then apply the Prevent Overlap option
  8. Run the Noverlap layout
  9. In the Preview tab: set curved edges and node labels without proportional size
  10. Export the svg file to be further edited in a vector graphics editor

Incoming and outgoing links

This visualization shows the balance between incoming and outgoing links.
Visualization: Every article is represented by bars showing the number of incoming and outgoing links, broken down by type (articles, users, categories, templates, portals).

Data scraping
  1. Go to wikimole/scraper/lib/scraper.py
  2. Run the function get_incoming_links() and get the file with the list of incoming links
  3. Run the function get_outgoing_links() and get the file with the list of outgoing links
Data parsing
  1. Create an Excel file and import both the incoming-links file and the outgoing-links file
  2. Create the dataset to be visualized, following the example of: wikimoindex.html/assets/data/20170803/in_out_links_2017.csv
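The breakdown by type can be sketched in Python as well. This is an illustration only: it assumes each link is a page title whose type can be read from its namespace prefix (User:, Category:, Template:, Portal:), and that anything else counts as an article. The names link_type and count_by_type are hypothetical, and the layout of the final CSV is not reproduced here.

```python
from collections import Counter

# Assumption: English namespace prefixes; other prefixes are treated as articles.
PREFIX_TO_TYPE = {
    "User": "users",
    "Category": "categories",
    "Template": "templates",
    "Portal": "portals",
}
TYPES = ["articles", "users", "categories", "templates", "portals"]


def link_type(title):
    """Return the link type for a page title, based on its namespace prefix."""
    prefix, separator, _ = title.partition(":")
    if separator and prefix in PREFIX_TO_TYPE:
        return PREFIX_TO_TYPE[prefix]
    return "articles"


def count_by_type(titles):
    """Count the links of each type, including types with zero links."""
    counts = Counter(link_type(title) for title in titles)
    return {kind: counts.get(kind, 0) for kind in TYPES}
```

Running count_by_type once on the incoming links and once on the outgoing links of an article gives the two sets of values that the bars compare.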
Data visualization
  1. Import the dataset in the script, following the example of: wikimoindex.html/dataviz/in_out_links/in_out_links_all.js
  2. Export the svg file to be further edited in a vector graphics editor