Introduction to SNA

Social Network Analysis

Termeh Shafie

Lecturer

Termeh Shafie


Statistician
Statistical Modeling of Networks
R enthusiast

and you?

Overview


Tuesdays 13:30-15:00
Lecture, discussions and live coding


Wednesdays 10:00-11:30
Lab, work sheets and self-study

What do we cover?

Research Design
Data Collection
Methodology

(and some theory)

Why study networks?

conventional research methods are often individual based and our models tend to model relations between variables

but nature and culture are structured as networks

  • society
  • brain (neural networks)
  • organizations (who reports to whom)
  • economies (who sells to whom)
  • ecologies (who eats whom)

Position within a network is important for predicting outcomes

Why study networks?

Simmel, 1908/1971:

Society exists where a number of individuals enter into interaction

Durkheim, 1974:

Society has for its substratum the mass of associated individuals. The system which they form by uniting together […] their channels of communication [are] the basis from which social life is raised

Marx, 1973:

Society does not consist of individuals, but expresses the sum of interrelations, the relations within which these individuals stand

Why study networks?

…but conventional research methods are often individual based and our models tend to model relations between variables, not people

Example

David eats predominantly vegetarian food

individual-based:

  • ethical
  • economic
  • health
  • taste

network-based:

  • vegetarian partner/friends

Example

Someone close to you is unhappy…

…will you remain unaffected?

Example

equal opportunities based on our individual qualities…

…or on our personal networks?

From “ordinary” to network data


atomic data
individuals or entities

dyadic data
dependent pairs of individuals (e.g. couples)
but treated as independent entities

networks
interdependent and overlapping dyads
usual (statistical) independence assumptions do not hold

4 pillars of network analysis

  1. Social network analysis is motivated by a structural intuition based on ties linking social actors

  2. It is grounded in systematic empirical data

  3. It draws heavily on graphic imagery

  4. It relies on the use of mathematical and/or computational models

Brief history of SNA

Beginnings in Sociology

Georg Simmel (1858–1918)

  • pioneered the concept of social structure
  • developed early structural theories
  • dyads and triads

If there is to be a science whose subject matter is society and nothing else, it must exclusively investigate these interactions, these kinds and forms of sociation.

Sociometry

Jacob Moreno (1889–1974)

  • sociometry
  • sociogram
  • together with Helen Hall Jennings

Network position and outcome

Alex Bavelas (1913–1993)

  • information diffusion within small groups
  • influence of network structure on efficiency
  • developed the concept of centralization
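
Centralization can be illustrated with two small communication patterns. The following is a minimal sketch using igraph's `centr_degree()`; the five-person star and ring are toy graphs chosen for illustration, not data from Bavelas's experiments, and degree centralization is only one of several centralization indices.

```r
library(igraph)

# Two hypothetical five-person communication patterns
star <- make_star(5, mode = "undirected")  # one hub connected to four others
ring <- make_ring(5)                       # everyone has exactly two neighbours

# Degree centralization, normalized by the theoretical maximum for a
# simple graph (loops = FALSE), so values lie between 0 and 1
star_cent <- centr_degree(star, loops = FALSE)$centralization
ring_cent <- centr_degree(ring, loops = FALSE)$centralization

star_cent  # the star is the most centralized pattern possible
ring_cent  # in the ring all degrees are equal, so nothing is centralized
```

The comparison makes the idea concrete: centralization measures how strongly a network is organized around a single actor, not how many ties it has.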

Network theory

Elizabeth Bott (1924–2016)

  • Manchester School of Anthropology
  • developed the first network theory

Bott hypothesis
the density of a husband’s and wife’s separate social networks is positively associated with marital role segregation

And then the Physicists came

Barabási/Watts & Strogatz

  • Preferential attachment/Small world
  • SNA vs. complex networks
  • Popularized the network concept

I expressed the pious hope that […] our colleagues from physics would simply join in the collective enterprise. That hope, however, was not immediately realized. These physicists, new to social network analysis, did not read our literature; they acted as if our sixty years of effort amounted to nothing… (L. Freeman)

Doing SNA

Levels of Analysis

dyad level
Fundamental unit of network data collection
(“Does sharing offices lead to friendship?”)

node level
Aggregation of dyad level measurement
(“Do actors with more friends have a stronger immune system?”)

network level
Assessing overall structure of a network
(“Do well connected networks diffuse ideas faster?”)

more levels are possible (triads, groups, …)
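
The three levels can be sketched on a single toy network. The names and ties below are hypothetical, and the chosen measures (adjacency, degree, density) are just one simple representative per level.

```r
library(igraph)

# Hypothetical friendship network among five people
g <- graph_from_literal(Ana - Ben, Ana - Cem, Ben - Cem, Cem - Dia, Dia - Eli)

# Dyad level: is there a tie between a given pair of actors?
are_adjacent(g, "Ana", "Ben")
are_adjacent(g, "Ana", "Eli")

# Node level: aggregate each actor's dyads, here the number of friends
degree(g)

# Network level: share of all possible ties that are present
edge_density(g)
```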

Types of relations I


Relational states

  • Similarities: location, participation, attribute
  • Relational roles: kinship, other roles
  • Relational cognition: affective, perceptual

Relational events

  • Interactions: sold to, talked to, helped, …
  • Flows: information, belief, money

Types of relations II


undirected
symmetric relation

directed
asymmetric relation, but can be bi-directional

valued
strength of relation, frequency of contact, etc.

signed
positive and negative relations

or a mixture thereof
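
As a rough sketch, these relation types could be represented in igraph as follows. The graphs are toy examples, and the attribute name `sign` is our own choice for illustration; igraph only gives special meaning to the edge attribute `weight`.

```r
library(igraph)

# Undirected: a symmetric relation such as "shares an office with"
g_undirected <- graph_from_literal(A - B, B - C)

# Directed: A and B name each other (bi-directional), B names C one-way
g_directed <- graph_from_literal(A +-+ B, B -+ C)
which_mutual(g_directed)  # TRUE for edges that are reciprocated

# Valued: store tie strength (e.g. contacts per week) as an edge attribute
g_valued <- g_undirected
E(g_valued)$weight <- c(5, 1)

# Signed: store positive and negative relations as +1 / -1
g_signed <- g_undirected
E(g_signed)$sign <- c(1, -1)
```

A mixture is obtained by combining these, for example a directed graph that carries both a `weight` and a `sign` attribute.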

Goals of analysis

Network variables as independent/explanatory

Using network theory to explain the consequences of network properties

social capital, brokerage, adoption of innovation

Network variables as dependent/outcomes

Using ______ theory to explain the antecedents of a network

homophily, balance theory

Node Level

type                independent                  dependent                    ex. hypotheses
network theory      node level network property  actor attribute              centrality ⟹ performance
theory of networks  actor attribute              node level network property  good looks ⟹ centrality

Dyad Level

type                independent           dependent             ex. hypotheses
network theory      network tie           attribute similarity  friends ⟹ similar interests
theory of networks  attribute similarity  network tie           smoking ⟹ friendship
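
A first descriptive look at the dyad-level association between ties and attribute similarity could look like the sketch below. The network and the smoking attribute are made up for illustration, and the assortativity coefficient is only one possible summary.

```r
library(igraph)

# Hypothetical friendship ties among six people
g <- graph_from_literal(A - B, B - C, A - C, D - E, E - F, C - D)

# Hypothetical actor attribute: 1 = smoker, 2 = non-smoker (order A to F)
V(g)$smoker <- c(1, 1, 1, 2, 2, 2)

# Share of ties that connect two actors with the same smoking status
el <- ends(g, E(g), names = FALSE)
same <- V(g)$smoker[el[, 1]] == V(g)$smoker[el[, 2]]
mean(same)

# Assortativity: positive when ties occur mostly between similar actors
r <- assortativity_nominal(g, types = V(g)$smoker, directed = FALSE)
r
```

A positive value is compatible with both rows of the table: cross-sectional data alone cannot tell whether similarity produced the ties or the ties produced similarity.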

Some Examples

The strength of weak ties

Strong ties have redundant information for individuals
Weak ties spread information between groups
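
The structural side of this argument can be sketched with two tightly knit groups joined by a single tie. The graph is a toy example and tie strength itself is not modelled; the sketch only shows why a tie between groups matters for reaching new information.

```r
library(igraph)

# Two hypothetical friendship triangles connected by one tie (C - D)
g <- graph_from_literal(A - B, B - C, A - C,
                        D - E, E - F, D - F,
                        C - D)

# Edge betweenness: number of shortest paths that run over each tie
eb <- edge_betweenness(g)
top <- which.max(eb)
ends(g, top)  # the tie between the groups carries the most paths

# Without that tie, nothing can travel from one group to the other
is_connected(delete_edges(g, top))
```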

Link to paper

The spread of obesity


A person’s chances of becoming obese increased by 57% if he or she had a friend who became obese in a given interval […] These effects were not seen among neighbors in the immediate geographic location.

Link to paper

Lethality and centrality

The correlation between the connectivity and indispensability of a given protein confirms that, despite the importance of individual biochemical function and genetic redundancy, the robustness against mutations in yeast is also derived from the organization of interactions and the topological positions of individual proteins.

Link to paper

Social licking among cows

Link to paper

Most methods rely on concepts from graph theory
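
To make the link to graph theory concrete, the sketch below represents a small hypothetical network as an adjacency matrix and derives two basic graph-theoretic quantities from it. The matrix is invented for illustration.

```r
library(igraph)

# Adjacency matrix: entry [i, j] is 1 if actors i and j are tied
adj <- matrix(c(0, 1, 1, 0,
                1, 0, 1, 0,
                1, 1, 0, 1,
                0, 0, 1, 0),
              nrow = 4, byrow = TRUE,
              dimnames = list(LETTERS[1:4], LETTERS[1:4]))

g <- graph_from_adjacency_matrix(adj, mode = "undirected")

# Degree of each node equals the corresponding row sum of the matrix
degree(g)
rowSums(adj)

# Geodesic distance: length of the shortest path between two nodes
distances(g)["A", "D"]
```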

R ecosystem for networks

Why R?



  • open source
  • cross-platform
  • CRAN
  • reproducibility
  • more than SNA
  • community

Packages for basic SNA


igraph

  • offers efficient data structures
  • implemented in C (also available in Python)

sna

  • relies on the network package
  • “clone” of UCINET

Package dependencies

CRAN packages that depend on igraph, network, and graph

Which package to choose?


use igraph if

  • you need speed (large networks)
  • you need to use other SNA packages

use sna if

  • you need to do modeling (e.g. ERGMs and RSiena)

In most cases the choice makes no difference, but never load both at the same time (several function names clash)!

library(igraph)
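
The advice not to load both packages stems from name clashes: igraph and sna both export functions such as `degree()` and `betweenness()`, so whichever package is attached last masks the other. One defensive habit, sketched here on a toy graph, is to call functions with an explicit namespace.

```r
library(igraph)

# Toy graph: a ring of four nodes, each with two neighbours
g <- make_ring(4)

# The package prefix guarantees that igraph's degree() is used,
# even if another package defining degree() is attached later
deg <- igraph::degree(g)
deg
```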