Kansas State University

IT News

The NodeXL Series: Using VOSON for Hyperlink Network Analysis (Part 9)

The NodeXL add-in is set up to work with a variety of other systems to import data. These are referred to as Third Party Graph Data Importers in the system. One of these systems is the Virtual Observatory for the Study of Online Networks (VOSON), created by Uberlink. (There is a page that addresses the overlap of NodeXL and VOSON.)

Enabling VOSON Functionality

To add this functionality to NodeXL, go to Uberlink’s main site.  To access the VOSON + NodeXL software release, you need to create a verified account.  Once that has been created, download the appropriate software, and install it per the instructions.

The VOSON System is web-based software that enables the collection and analysis of online network data. A human being at Uberlink vets the data crawls. The VOSON Data Provider for NodeXL enables access to the VOSON hyperlink network data collection services from within NodeXL. This Data Provider works on both Windows 7 and Windows XP. Once the Data Provider DLL file has been downloaded, it must be placed in the PlugIns folder for the NodeXL Excel Template. (This assumes that recent versions of Excel and NodeXL have been installed properly.)

Using VOSON

Please start NodeXL.  Go to the NodeXL tab, click on Import > Get Third-Party Graph Data Importers > VOSON Hyperlink Networks.  The screen capture below shows this.

This will take you to Uberlink’s site, where you can learn more about the VOSON system.  An account is fairly easy to set up but will require a valid .edu email.  The application process also requires the sharing of information on how the data crawls will be used.

Downloading the Software

After logging into the Uberlink site with an authenticated account, download the .dll files as indicated.

The requisite NodeXL_VOSON_Spigot.ver_0-5-4-15.dll file will be downloaded. Place the .dll file in C: > Program Files > Social Media Research Foundation > NodeXL Excel Template > Plug-ins.
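For readers who prefer to script the copy step, the following is a minimal sketch only. It assumes the folder path given above is correct for the local install (the drive letter and the exact spelling of the plug-in folder may differ), and the function name `install_plugin` is hypothetical rather than part of NodeXL or VOSON.

```python
import shutil
from pathlib import Path, PureWindowsPath

# Folder named in this article; treat it as an assumption and verify it locally.
PLUGIN_DIR = PureWindowsPath(
    "C:/", "Program Files", "Social Media Research Foundation",
    "NodeXL Excel Template", "Plug-ins",
)


def install_plugin(dll_path, plugin_dir):
    """Copy a downloaded .dll into the plug-in folder and return the new path."""
    dll_path = Path(dll_path)
    plugin_dir = Path(plugin_dir)
    if dll_path.suffix.lower() != ".dll":
        raise ValueError(f"expected a .dll file, got {dll_path.name}")
    if not plugin_dir.is_dir():
        raise FileNotFoundError(f"plug-in folder not found: {plugin_dir}")
    # copy2 keeps the file's timestamps along with its contents.
    return Path(shutil.copy2(dll_path, plugin_dir / dll_path.name))
```

On a Windows machine this might be called as `install_plugin(downloaded_file, PLUGIN_DIR)`, where `downloaded_file` is wherever the browser saved the .dll. Writing to Program Files usually requires administrator rights.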

Opening the NodeXL Excel Template

The next time the NodeXL Excel Template is opened, the VOSON system can be accessed directly.

To start the VOSON crawl, open the NodeXL Excel Template. Click on the NodeXL tab. Click on Import > “From Web 1.0/Blog Network (via VOSON).”

A VOSON login window will pop up. Note that there are two tracks. One, VOSON@ANU, is for academic research and likely enables broader access to Uberlink servers. The other, VOSON@Uberlink, is for general use.

Use the dropdown menu for the VOSON Provider, and go to VOSON@Uberlink (unless you have an ANU account for academic research and teaching). Put in the user name and password. Then click “Logon.” Once the account has been authenticated, a window will show the available prior networks. This window lists each network's name, the name of the project, the size, and the date it was created. (The following screenshot has been redacted.)

A user may select a former data extraction and refresh it with a new crawl.  Or, he or she may build a new one.  For this blog entry, the “Build new” option will be used. The “Create VOSON database” window will appear.

First, it is important to label the database.  (“AN” will be tagged on the end of all database names.)  For this website crawl, EDUCAUSE’s site (http://www.educause.edu/) will be used.  “EDUCAUSE” will be the database name. For the comment, the URL will be inserted.  The inbound crawl default setting is 1000 inbound links; the outbound crawl also has a default setting of 1000 outlinks (these are the high ends on both).  For the outbound crawl, the maximum unproductive pages listed will be 25, and the depth of the crawl is at 50 pages.
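To keep a record of the settings used for a crawl, the values above can be written down as a small data structure. This is only an illustrative sketch: the class and field names are hypothetical and are not part of the VOSON or NodeXL software, and the limit of 1000 simply reflects the "high end" described in this article.

```python
from dataclasses import dataclass

# Upper limit for inbound and outbound links as described in this article.
MAX_LINKS = 1000


@dataclass(frozen=True)
class CrawlSettings:
    """Hypothetical record of the 'Create VOSON database' form values."""
    name: str
    comment: str
    max_inbound_links: int = 1000
    max_outbound_links: int = 1000
    max_unproductive_pages: int = 25
    crawl_depth: int = 50
    collect_favicon: bool = True

    def __post_init__(self):
        # Reject link limits outside the range the form is described as allowing.
        for field_name in ("max_inbound_links", "max_outbound_links"):
            value = getattr(self, field_name)
            if not 1 <= value <= MAX_LINKS:
                raise ValueError(f"{field_name} must be between 1 and {MAX_LINKS}")


# The crawl described in this post, using the default settings.
settings = CrawlSettings(name="EDUCAUSE", comment="http://www.educause.edu/")
```

Saving such a record alongside the downloaded network makes it easier to repeat or compare crawls later.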

Finally, users may decide whether or not to collect the favicon.ico. The favicon.ico is a small icon file that a website supplies for display in the browser’s address bar (tools such as http://www.favicon.cc/ can be used to create one). “Favicon” stands for “favorite icon”; it may also be known as a shortcut icon, URL icon, bookmark icon, or website icon, according to “Favicon” (http://en.wikipedia.org/wiki/Favicon) in Wikipedia. These icons are usually 16 x 16 pixels and do not take up much memory, so the default checkmark to collect these icons will be left in place.
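As a small illustration of where such an icon usually lives, the sketch below builds the conventional default address, /favicon.ico at the root of the site. The function name is hypothetical, and this says nothing about how VOSON itself collects icons; a site may also declare a different icon location in its HTML.

```python
from urllib.parse import urljoin


def default_favicon_url(site_url):
    """Return the conventional root-level favicon address for a site."""
    # A leading slash makes urljoin resolve against the site root,
    # regardless of which page of the site the URL points to.
    return urljoin(site_url, "/favicon.ico")


print(default_favicon_url("http://www.educause.edu/"))
```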

Click “Create database.”  The new crawl will appear at the top of the window.

Next, log out by clicking on the top right x to exit the window. Close out of the NodeXL file. (The “Do you want to save the changes you made to ‘NodeXLGraph1’?” message may be answered with “Don’t Save.”) The NodeXL template will not be used until the crawl is actually completed and available for download from Uberlink.

 

The next step is to wait for an email from VOSON saying that the crawl is complete. If Uberlink decides that the crawl is too large to handle or is something that they do not want to pursue, they will simply not respond. (A good deal of human judgment is involved in this process: the crawl itself is automated, but it requires human approval. Many organizations that create free tools for the broader public have become much more savvy about ensuring that their tools are not abused or misused.)

The Wait

(At the time of publication, Uberlink had not sent any communications. To meet the publication schedule, this entry was published by the editor. More information will be offered as it comes in.)

A check of the records shows that the crawl was not actually completed.

Recrawls

The same VOSON tool (linked to NodeXL) may be used again for recrawling based on the same or different parameters.  The accounts, crawl history, and crawl parameters are recorded on the Uberlink “cloud” and must be accessed there.

A hyperlink crawl shows the connectivity of websites in terms of how they point to and connect with each other. This crawl may be applied to Web 2.0 technologies like publicly available wikis and blogs. (There are other tools which enable such crawls, such as Maltego Radium.)
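To make the idea of "pointing to and connecting with each other" concrete, here is a minimal sketch of a hyperlink network as a list of directed links, with a count of inbound and outbound links per site. The site names and links are made up for illustration and do not come from any VOSON crawl.

```python
from collections import Counter

# Hypothetical (source, target) pairs: "source links to target".
links = [
    ("a.example", "b.example"),
    ("a.example", "c.example"),
    ("b.example", "c.example"),
    ("d.example", "a.example"),
]


def degree_counts(edges):
    """Return {site: (inbound_links, outbound_links)} for a directed edge list."""
    out_degree = Counter(source for source, _ in edges)
    in_degree = Counter(target for _, target in edges)
    sites = sorted(set(in_degree) | set(out_degree))
    return {site: (in_degree[site], out_degree[site]) for site in sites}


for site, (inbound, outbound) in degree_counts(links).items():
    print(f"{site}: {inbound} inbound, {outbound} outbound")
```

In this toy network, c.example receives the most inbound links while a.example sends the most outbound links, which is the kind of pattern a network analysis tool such as NodeXL is designed to surface from a real crawl.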

An Uberlink Community

Once logged in, users may visit the Uberlink Community at http://www.uberlink.com/community.  An outdated manual is downloadable from the site.  Users may email support@uberlink.com for some support.

Final Note: NodeXL is a free and open-source tool that is available from Microsoft’s CodePlex site (which is a space for project hosting for open-source software), and it is sponsored by the Social Media Research Foundation.
