Kansas State University

IT News

Category: Resources and tools

The NodeXL Series: Extracting a User Network from Flickr (Part 11)

On content-sharing sites like Flickr, it is also possible to extract the social networks of user accounts.  Such an extraction captures the interrelationships among individuals who have accounts on the site and shows the linkages between accounts in terms of cross-references.  For this blog entry, we will use the NodeXL Excel Template to extract the “USDAgov” user network on Flickr, a peer content-sharing site for photos and videos.  The “USDAgov” network is likely fairly large, even for a content site, because of its government mandate.

The NodeXL Series: Conducting a Data Crawl of an “Event Graph” (Part 10)

An “event graph” is an emergent social network that is created based on how people interact with each other on a microblogging (or short messaging service) site surrounding an event or conference.  In academia, an “event graph” usually centers on a presentation or conference for which a particular hashtag is used to identify Tweets specific to that event.  Oftentimes, the backchannel chatter is captured as part of the digital artifacts from that conference and stored as part of the digital poster sessions.

The NodeXL Series: Using VOSON for Hyperlink Network Analysis (Part 9)

The NodeXL add-in is set up to work with a variety of other systems to import data.  These are referred to as Third Party Graph Data Importers in the system.  One of these systems is known as the Virtual Observatory for the Study of Online Networks (VOSON), created by Uberlink.  (There is a page that addresses the overlap of NodeXL and VOSON here.)

Enabling VOSON Functionality

To add this functionality to NodeXL, go to Uberlink’s main site.  To access the VOSON + NodeXL software release, you need to create a verified account.  Once that has been created, download the appropriate software, and install it per the instructions.

The VOSON System is web-based software that enables the collection and analysis of online network data.  A human being at Uberlink vets the data crawls.  The VOSON Data Provider for NodeXL enables access to the VOSON hyperlink network data collection services from within NodeXL.  This Data Provider works on both Windows 7 and Windows XP.  Once the Data Provider DLL file has been downloaded, it must be placed in the PlugIns folder for the NodeXL Excel Template.  (This assumes that recent versions of Excel and NodeXL have been installed properly.)

The NodeXL Series: Conducting a Data Crawl of a Facebook Fan Page (Part 8)

Facebook is currently the foremost social networking site in the Western world.  Many individuals and entities create fan pages on this social network to be their public-facing side.  The ability to extract information from Facebook requires an authorized account.

To practice this data extraction, this entry describes the extraction of the social network around the Hershey’s page on Facebook (https://www.facebook.com/HERSHEYS).  Because the page has 5.9 million likes, any crawl will have to be a limited one in order not to overwhelm NodeXL.

The NodeXL Series: Conducting a Data Extraction of a YouTube Video Network (Part 7)

A content network consists of related clusters of information, and analyzing it shows how those clusters connect.  Social media platforms that enable the sharing of content align with research into crowd-sourcing and self-organizing behaviors, where individuals, often working in isolation or in small groups, share content that benefits people on the whole.  One of the most popular digital content sharing sites is Google’s YouTube, where people may share videos of themselves.

An extraction of a video network is based on the metadata used to label the video contents, and this extraction will result in a related tags crawl.

Cat Videos

A popular meme involves videos of cats and their antics.  A search for cat videos on YouTube surfaces two talking cats, skydiving cats (filmed in front of a green screen), grumpy cats, cats v. dogs, and other themes.  This huge amount of human attention to cats has led to the phenomenon of “catvertising” (using cats in word-of-mouth advertising).  In celebration of this theme, this blog entry will focus on a crawl of “cat” on YouTube.  (Also, “cat” is a fairly unambiguous search term.)

The NodeXL Series: Conducting a Twitter User Network Crawl (Part 6)

Per the prior entry, a hashtag search is very time-dependent and ephemeral, whereas the user accounts and relationships created around entities (people, organizations, companies, robots, and “cyborgs”) tend to be more stable.  While the research does not necessarily show that a reciprocal follower / following relationship means that all Tweets are read and engaged with, such relationships do show some initial commitment and a public declaration of a kind of relationship.  (Those interested in the research may find that there are surprises, such as that popularity and positive word-of-mouth do not necessarily translate to sales commitments.  Further, there is sufficient system gaming through ‘bot and other accounts that a more accurate read of a user network requires more digging and critical analysis.)

First, it helps to pick a “target.”  Searching for an organization’s name plus “Twitter” in a search engine will often lead to the account information.  For our purposes, we’ll go with the Centers for Disease Control and Prevention (CDC), in part because they have a clear social media strategy to engage their constituents.

A Limited Crawl of the CDCGov User Network on Twitter

The official CDCgov Twitter account is https://twitter.com/CDCgov.  (Do read the fine print carefully to make sure that you haven’t landed on a fake site.  There are many pretenders, some not so subtle and others much harder to spot.)

The NodeXL Series: Conducting a Hashtag Network Search of Twitter (Part 5)

Sometimes, it’s interesting to learn what is being Tweeted (micro-blogged) about particular topics in real time along with who is posting which messages.  NodeXL enables the extraction of individuals engaged in a particular hashtag-labeled microblogging conversation through Twitter’s application programming interface (API).  This entry will provide an overview of how this is done.

Hashtags

A hashtag is a snippet of text prefaced with a # (hashtag or pound) sign which indicates that the message is focused on a particular theme or topic.  In Twitter, the microblogging site, various Tweeted threads are collected around hashtags for coherent 140-character conversations from people from around the world.

A hashtag search of Twitter, then, involves the extraction of entities (Twitter accounts representing people, robots, and cyborgs) who have conversed around a particular topic.  The application programming interface (API) used in NodeXL only retrieves hashtagged Tweets posted in the prior week and a half.
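To make the idea of a hashtag network concrete, here is a minimal Python sketch that works on a small, made-up list of messages rather than on Twitter's API.  The function name hashtag_network, the sample accounts, and the choice to draw an edge from an author to each account mentioned in the message are all assumptions for illustration; this is not how NodeXL itself is implemented.

```python
import re
from collections import defaultdict

# Simple patterns for #hashtags and @mentions (word characters only).
HASHTAG = re.compile(r"#(\w+)")
MENTION = re.compile(r"@(\w+)")

def hashtag_network(tweets, hashtag):
    """Count directed edges (author -> mentioned account) among
    messages that contain the given hashtag (case-insensitive)."""
    wanted = hashtag.lstrip("#").lower()
    edges = defaultdict(int)
    for author, text in tweets:
        tags = {tag.lower() for tag in HASHTAG.findall(text)}
        if wanted not in tags:
            continue  # message is not part of this conversation
        for mentioned in MENTION.findall(text):
            edges[(author, mentioned)] += 1
    return dict(edges)

# Hypothetical sample data: (author, message text) pairs.
sample_tweets = [
    ("alice", "Great keynote at #conf2013, thanks @bob!"),
    ("bob", "@alice glad you enjoyed it #Conf2013"),
    ("carol", "Unrelated post about #cats"),
]

edges = hashtag_network(sample_tweets, "#conf2013")
for (source, target), weight in sorted(edges.items()):
    print(source, "->", target, weight)
```

Only the first two messages carry the hashtag, so only alice and bob end up in the resulting network; carol's message is skipped.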

The NodeXL Series: Conducting a Crawl of Flickr for a Content Network (Part 4)

A related tags network on a content site shows the interrelationships between the textual metadata used to label particular images (or videos).  A content network may be built around a particular search term.  The tags are searched on a social media (sharing) platform, and instances of the term are discovered.  A graph is then created from the interrelationships between related terms.
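One simple way to picture a related tags graph is as a co-occurrence network: two tags are linked whenever they label the same item.  The Python sketch below assumes that approach and uses invented sample data; Flickr and NodeXL may determine related tags differently, and the function name related_tag_edges is hypothetical.

```python
from collections import Counter
from itertools import combinations

def related_tag_edges(tagged_items):
    """Count how often each pair of tags appears on the same item.
    Each pair is stored in alphabetical order so (a, b) == (b, a)."""
    edges = Counter()
    for tags in tagged_items:
        unique = sorted({tag.lower() for tag in tags})
        for first, second in combinations(unique, 2):
            edges[(first, second)] += 1
    return edges

# Hypothetical sample data: the tags attached to three photos.
sample_items = [
    ["cat", "kitten", "cute"],
    ["cat", "cute"],
    ["dog", "cute"],
]

tag_edges = related_tag_edges(sample_items)
for (first, second), weight in tag_edges.most_common():
    print(first, "--", second, weight)
```

The edge weights give a rough sense of how strongly two tags are related, which is the kind of information a graph layout can then turn into clusters.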

Starting a Content Crawl of Related Tags on Flickr

To start a content crawl of related tags on Flickr, start up the NodeXL template.  Click on the NodeXL tab to open the ribbon.

In the File area, click on “Import” to open the dropdown menu.  Highlight “From Flickr Related Tags Network…”

The NodeXL Series: Visualizing NodeXL Graphs (Part 3)

To give a sense of the various types of graphs that may be “drawn” from data using NodeXL, this entry highlights some of the different types of graphs.  This entry uses a data extraction from Twitter.  All the graphs here are taken from the same data set; the only differences in the visualizations come from the layout algorithms.  [This data crawl, covered further in later entries, was a 2-degree crawl of Pulitzer Prize-winning author Laurie Garrett’s (Laurie_Garrett) user network on the microblogging site Twitter, with an ego neighborhood limit of 100 persons (alters).  Her formal account has 3,768 Tweets, 228 following, and 2,558 followers.]
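For readers unfamiliar with the terms, a 2-degree crawl with a neighborhood limit can be sketched as a breadth-first search that stops two steps out from the ego and follows at most a fixed number of alters per account.  The Python below is an illustration on a toy network; the function limited_crawl, its parameters, and the way the limit is applied are assumptions, not a description of NodeXL's internals.

```python
from collections import deque

def limited_crawl(neighbors, ego, max_degree=2, limit=100):
    """Breadth-first crawl from `ego`.
    neighbors: callable returning a list of accounts linked to a node.
    max_degree: how many steps out from the ego to expand.
    limit: maximum number of alters followed per account."""
    depth_of = {ego: 0}
    edges = []
    queue = deque([ego])
    while queue:
        node = queue.popleft()
        depth = depth_of[node]
        if depth >= max_degree:
            continue  # accounts at the outer ring are not expanded
        for alter in neighbors(node)[:limit]:
            edges.append((node, alter))
            if alter not in depth_of:
                depth_of[alter] = depth + 1
                queue.append(alter)
    return depth_of, edges

# Hypothetical toy network standing in for a real API.
toy = {
    "ego": ["a", "b", "c"],
    "a": ["d"],
    "b": ["ego", "e"],
    "d": ["f"],
}

depth_of, edges = limited_crawl(lambda n: toy.get(n, []), "ego",
                                max_degree=2, limit=2)
print(depth_of)
print(edges)
```

With a limit of 2, account "c" is never visited, and "f" lies three steps out, so it is also excluded.  Each call to neighbors would correspond to one or more API requests in a real crawl, which is one reason crawl size matters.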

The Data Extraction

This data extraction required an overnight crawl because of the limits Twitter imposes through its application programming interface (API) and the size of the electronic social network.

The NodeXL Series: Downloading NodeXL (Part 2)

Once you’ve made sure that your computer and version of MS Office (particularly Excel) are compatible with NodeXL, you are ready to download this free add-in.

To download the NodeXL add-in, go to the CodePlex site.  At the top right is a button that reads “download.”
