[MUSIC] Today, we will start talking about analytical approaches to network analysis. The lecture's title sounds ambitious: network analysis as a method. Does that mean we're going to talk about everything that we can do with networks? Of course not. It's just the beginning; it's an introduction to network analysis as a method. Our lecture today will have two distinct parts, theoretical and practical. First, starting with network analysis as a method, we will define important terminology; basically, we'll outline the terms we will need throughout the lecture. Once we have done that, we'll move on to network data. Network data are very different from regular data. We'll talk about collection and all other aspects of data for network analysis. Then we'll move on to study design. Because networks are special, you have to take into account many different things before you can design a good study. And then we'll start working with the method and talk about network descriptive statistics.
However, I find it very boring to talk about analytics, any analytics but especially network analytics, with just the terminology. So we're going to move to descriptive analysis in practice. I will introduce you to a case study. It's called The Story of a High Turnover. It's actually a very exciting story because it's based on a real-life data set from a real-life Russian company. Moreover, this data set is not available to anyone else; we collected it for a special study by our laboratory. Next we will take a short methodological digression. We'll talk about loading and manipulating network data in R. Do not worry, I did not forget that we haven't done much in R yet, so we'll start doing that today. Then we'll work with our data from the first study. We'll talk about the story of four different relationships, analyzing it through graphs. And finally we'll make one more methodological digression: we'll talk about different ways to draw networks using different R packages.
Okay, let's talk about important terminology. Before we move on to terms, though, please allow me to make one caveat. Traditional statistics can be broadly divided into two types of analysis: descriptive, with plots and graphs and numbers as a way to describe data, and inferential, where all the complex methods belong. Network analysis as a methodology follows the same general principle, though the lines are blurrier because of the information that descriptive methods in networks provide. There are some network-analytic approaches that we by tradition consider descriptive. However, as we have seen in the first lecture, some traditionally descriptive methods, such as graphs and networks, can also be inferential.
The more or less descriptive methods are graphs and matrices, then measures such as centrality, prestige, and related actor and group measures. There are also structural balance, clusterability, and transitivity as descriptive characteristics of networks. We can also talk about cohesive subgroups as characteristics of networks, or special features inside a larger whole network. We can talk about affiliations, co-memberships, and overlapping subgroups; these are also descriptive measures. And finally, dyadic and triadic methods. They are still considered descriptive, even though most of the methods outlined here are inferential.
And then, of course, there are inferential methods, so called because they involve modeling, some of it quite complex. Some methods, such as block models, can also remain purely descriptive; it all depends on the purpose. The more or less inferential methods are structural equivalence measures and block models. Here we can also talk about relational algebra, even though that's more the domain of graph theory; in our course, we will not be using relational algebra at all. Then come generalizations of roles and positions and, finally, statistical models.
Statistical models exist in a variety of forms: models for single and multiple relational networks, stochastic block models and goodness-of-fit indices, longitudinal analysis, and a variety of model combinations explaining social behavior, such as diffusion models. Now, you have probably heard some terms you don't know already. So let's move to definitions, and we'll start with network nodes, which we have talked about already. In networks, there are terms that describe the local and the global network structures. Even though we talked a little about them in the first lecture, let's formalize those definitions.
A network is a set of relations, a description of which includes a set of nodes (objects or actors) and a set of connections, or ties, between them. In social networks, nodes can be people, groups, organizations, countries, physical or cultural objects created or used by people, people's activities, or anything else. Social nodes can have attributes, of course: for people, demographics such as gender, age, or party affiliation; for organizations, that could be organizational size, country, and industry affiliation. So anything you can think of, anything that you can analyze as regular data with regular statistical approaches, could be an attribute.
Connections, or ties, are what establishes linkages between nodes. In other words, they are relationships. For example, evaluation of one person by another, such as in friendship or liking: we can ask a person to name their friends, but then also to estimate the strength of the relationship with those friends on a scale, for example from 1 to 5. Ties can also indicate transfer of material resources, such as business transactions, giving gifts, or borrowing money. This is very important for banks. Ties can indicate association or affiliation, for example attending an event or belonging to the same club or organization. This is very important in affiliation networks.
For example, if two people belong to the same club, do they necessarily know each other? Ties can also be behavioral interactions, such as sharing news, talking, sending messages, or asking for advice. Advice networks are especially important in organizations, so we can also design a network study around them. Connections could be movements between places or statuses, such as migration or social or physical mobility. Ties can also establish formal authority, for example between subordinates and supervisors. And finally (well, maybe not the final point, but an important one) there are biological relationships, such as kinship or descent.
It's not enough to just indicate the type of relationship. You also have to quantify it. First of all, relationships can be directional or nondirectional. What's the difference? Think about it: if one person likes another, the other will not necessarily like the first person back. So there may be only one directed tie, from person A to person B. However, if persons A and B are husband and wife, then they're automatically spouses to each other.
Relationships can also be dichotomous or valued. Dichotomous relationships indicate only the presence or absence of a tie: if the relationship is present, it's a one; if it's absent, it's a zero. Valued relationships we just talked about: we ask respondents to evaluate the strength of a tie, let's say to a friend, on a scale from 1 to 5, from not very strong to the strongest possible. Relationships can also be positive and negative. Imagine we have a love network and a hate network. A tie strength of five on the love network would mean very strong love. A tie strength of five on the hate network would probably mean very strong hate. It's just easier to make those ties negative.
Now, can you easily quantify all relationships? Things like geographic proximity you probably can; we can measure that in miles or kilometers. Or take talking on the phone.
How much time do you spend talking to your friends? Again, you can quantify that in minutes or hours. Some things, such as going for advice, friendship, or attraction, are much more difficult, because we have to give our respondents a scale and explain what each point on that scale means. Otherwise we're going to get very different responses to a very difficult question.
Now, after we have quantified relationships, we can put them into some system for representing networks. There are many different ways. We'll start with the sociogram, because we've seen it before. It's just a graph representing social relations. Points, or nodes, represent our actors, whoever those actors might be. And edges (we'll also call them lines or arcs) are drawn between nodes if there is a relationship and not drawn if there is none.
The network matrix is a table of rows and columns. Depending on the network type, matrices can be symmetric, which is an adjacency matrix for nodes of one type, or asymmetric, which is an incidence matrix for nodes of two types. We haven't talked about that yet, but think about people belonging to organizations: we're going to have people in the rows and organizations in the columns, so the matrix will not be symmetric. Well, it all comes down to the data, does it not? Data matrices can be of two types, as I mentioned: entries are either zero or one, representing the absence or presence of a tie, or zero or another value, where the value shows not only the presence of a tie but also the strength of the tie.
The network structure refers to the presence of regular patterns in relationships, and structural variables are what we're trying to find. They are any quantities that measure structure: density, reachability, centrality, strength of ties, transitivity, clustering, homophily, equivalence, reciprocity, cohesiveness, distance, network diameter, and I can go on and on and on.
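To make the matrix representations above concrete, here is a minimal Python sketch (the course itself works in R; Python is used here only for illustration). All actor names, organizations, and tie strengths are invented, and the helper functions `sociomatrix`, `dichotomize`, and `density` are hypothetical names, not part of any package. Density is computed here in one common way for directed ties: present ties divided by possible ties, excluding self-ties.

```python
# Hypothetical actors and ties, invented purely for illustration.
actors = ["Ann", "Bob", "Cat", "Dan"]

# Directed, valued ties: (sender, receiver, strength on a 1-5 scale).
valued_ties = [
    ("Ann", "Bob", 5),
    ("Bob", "Ann", 3),
    ("Ann", "Cat", 2),
    ("Dan", "Cat", 4),
]


def sociomatrix(actors, ties):
    """Return a square matrix; rows are senders, columns are receivers."""
    index = {name: i for i, name in enumerate(actors)}
    matrix = [[0] * len(actors) for _ in actors]
    for sender, receiver, strength in ties:
        matrix[index[sender]][index[receiver]] = strength
    return matrix


def dichotomize(matrix):
    """Keep only the presence (1) or absence (0) of each tie."""
    return [[1 if value else 0 for value in row] for row in matrix]


def density(binary_matrix):
    """Share of possible directed ties (self-ties excluded) that are present."""
    g = len(binary_matrix)
    present = sum(
        binary_matrix[i][j] for i in range(g) for j in range(g) if i != j
    )
    return present / (g * (g - 1))


valued = sociomatrix(actors, valued_ties)
binary = dichotomize(valued)

# Two-mode data: people in the rows, organizations in the columns.
organizations = ["Chess club", "Choir"]
memberships = {
    "Ann": ["Chess club"],
    "Bob": ["Chess club", "Choir"],
    "Cat": [],
    "Dan": ["Choir"],
}
incidence = [
    [1 if org in memberships[person] else 0 for org in organizations]
    for person in actors
]

print(valued)
print(binary)
print(density(binary))
print(incidence)
```

The one-mode matrix is 4 by 4 because the same actors index both rows and columns, while the two-mode matrix is 4 by 2, people by organizations.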
Trust me, you're going to know what all of these terms mean by the end of the course. More than that, you will be able to use them to analyze your own networks.
Now, network data can be summarized in a variety of ways. We start, of course, with an actor, but the actor by itself is a meaningless number in the network. We need a dyad. A dyad represents a pair of actors and the possible relational ties between them. And think about it: is the absence of a tie also a type? Yes, it is. So we have four different types of ties between the actors in a dyad. A null dyad is the absence of a tie. A mutual dyad is a reciprocal relationship. And then we have two directed ties, one in each direction.
A triad is a subgraph consisting of three nodes and all the possible lines between them. Now, if we don't label those nodes, there are 16 possible triads. We'll talk about that when we talk about triadic analysis. It's actually a fascinating method, where you can say so much about your network knowing just its triadic composition. Why is that? Because the triad is the elementary unit of structural analysis in the network.
And then, of course, we have subgroups. There are two basic reasons why we're interested in defining and detecting subgroups in network data. The theoretical one is to provide structural definitions of the sociological notion of a group: what is a group? The practical one is to obtain data reduction and descriptions of patterns of observed dyadic cohesion. We assume that groups will be more homogeneous with respect to key variables than other subgraphs, either because of homophily or because of diffusion. And then, of course, we can break up the network into sets of actors, maybe sometimes even selecting random actors from the network to see what kind of structures they represent. Or we can analyze the entire network.
When reading about networks or hearing others describe them, you will encounter some notations, so it's important that we talk about them.
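The dyad types described above can be counted directly from a directed, dichotomous matrix. The following is a minimal Python sketch under stated assumptions: the matrix is invented for illustration, `dyad_census` is a hypothetical helper name, and the two one-directional cases (A to B only, B to A only) are pooled into a single "asymmetric" count, since without labels they are the same kind of dyad.

```python
from itertools import combinations

# Hypothetical directed, dichotomous matrix for four actors:
# entry [i][j] is 1 if actor i sends a tie to actor j, else 0.
x = [
    [0, 1, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 0],
]


def dyad_census(matrix):
    """Count mutual, asymmetric, and null dyads over all unordered pairs."""
    census = {"mutual": 0, "asymmetric": 0, "null": 0}
    for i, j in combinations(range(len(matrix)), 2):
        ties = matrix[i][j] + matrix[j][i]
        if ties == 2:
            census["mutual"] += 1      # ties in both directions
        elif ties == 1:
            census["asymmetric"] += 1  # one directed tie only
        else:
            census["null"] += 1        # no tie in either direction
    return census


census = dyad_census(x)
print(census)
```

With four actors there are 4 × 3 / 2 = 6 dyads, so the three counts always sum to 6 here.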
We distinguish between graph-theoretic and sociometric notation, though of course they refer to the same things; they're just concerned with different aspects of the analysis. In graph-theoretic notation, we talk about N, a set of actors represented by nodes. Usually we have N equal to some set of n1, n2, n3, and so on; there are g actors, or nodes, in that set. L is a set of lines or arcs representing relational ties between pairs of actors, and there are L lines in the set L. For multiple relations, we can have R relations and R sets of arcs. What does it mean, multiple relations? Well, that means that on the same nodes we have collected a network of friendship and a network of advice, and sometimes they might be different.
In sociometric notation, we also have X, which refers to a single relation, and x_ij is the strength or value of the tie from actor i to actor j. And then we have a sociomatrix X of size g by g, because it represents all the connections between our nodes. The rows and columns of the sociomatrix index the actors in identical order. So if our rows go in the order A, B, C, D, E, then our columns have to follow the same order: A, B, C, D, and E.
Now, there are a few more terms that are important. The first word is adjacent. It describes nodes: nodes ni and nj are adjacent if there is a line lk connecting ni and nj that belongs to the set L. The word incident refers to lines and nodes: nodes ni and nj are incident with the line lk if lk connects ni and nj. In a sociogram, points depict nodes, and a line is drawn between two points if there is a line in our set.
Now, I realize there are lots of terms here. You do not need to memorize all of them. They will become a natural part of your vocabulary once we get through a few examples of network analysis. So before we move on to additional network concepts, we need to talk about networks in general, the types of networks that we can work with. Of course, it all comes down to data.
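As a rough illustration of how the two notations line up, here is a small Python sketch assuming an undirected, dichotomous relation. The node and line labels are invented, and `adjacent`, `incident`, and `to_sociomatrix` are hypothetical helper names chosen to mirror the terms just defined.

```python
# Hypothetical node set N (g = 4) and line set L (3 lines), undirected.
nodes = ["n1", "n2", "n3", "n4"]
lines = {
    "l1": ("n1", "n2"),
    "l2": ("n2", "n3"),
    "l3": ("n3", "n4"),
}


def adjacent(ni, nj, lines):
    """Two nodes are adjacent if some line in L connects them."""
    return any({ni, nj} == set(ends) for ends in lines.values())


def incident(node, line, lines):
    """A node is incident with a line if it is one of the line's endpoints."""
    return node in lines[line]


def to_sociomatrix(nodes, lines):
    """Build the g-by-g sociomatrix X; rows and columns share one order."""
    index = {name: i for i, name in enumerate(nodes)}
    x = [[0] * len(nodes) for _ in nodes]
    for a, b in lines.values():
        # Undirected relation: record the tie in both cells.
        x[index[a]][index[b]] = 1
        x[index[b]][index[a]] = 1
    return x


X = to_sociomatrix(nodes, lines)
print(adjacent("n1", "n2", lines), adjacent("n1", "n3", lines))
print(incident("n2", "l1", lines), incident("n4", "l1", lines))
print(X)
```

Because the same list `nodes` orders both rows and columns, the entry X[i][j] always refers to the tie between the i-th and j-th actors, which is the "identical order" requirement in practice.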
So our next topic is network data types and network data collection. [MUSIC]