NetworkAnalyzer Settings
The following plugin settings can be configured by the user:
Store node / edge parameters in node / edge attributes
For every node in a network, NetworkAnalyzer computes its degree (in and outdegrees for directed networks), its clustering coefficient, the number of selfloops, and a variety of other parameters. NetworkAnalyzer also computes edge betweenness for each edge in the network. If the respective options are enabled, NetworkAnalyzer stores the computed values as attributes of the corresponding nodes and edges. This enables the users to apply different visualizations or to filter nodes or edges based on the values of the computed attributes. For a complete list of the computed node and edge attributes, please see the section Attributes.
Use expandable interface for the dialog that displays analysis results
If this option is enabled, analysis results are presented in a window in which all charts are placed below each other in expandable boxes. If this option is disabled, analysis results are presented in a window that contains tabs for the group of simple parameters and for every complex parameter. Users who wish to view simultaneously two or more complex parameters of one network, should enable this option.
Change colors for parameter visualization
This options allow the user to change the default colors of parameter visualization. Please see the section Visualizing Computed Parameters for more details.

Background color
The color of the background in the network view. It is initially set to the default Cytoscape background color. 
Bright color
This color defines the brightest color that parameters can be mapped to. By default its value is green. 
Middle color
This color defines the intermediate color, that parameters can be mapped to. By default its value is yellow. 
Dark color
This color defines the darkest color that parameters can be mapped to. By default its value is red.
Location of the help documents
URL of this help web page for NetworkAnalyzer. This also enables the local download and storage of this help page.
Network Interpretations
NetworkAnalyzer can perform topological analysis on directed networks (containing only directed edges) as well as on undirected networks (containing only undirected edges). An example of a directed network and an undirected network is given in Figure 1a and Figure 1b, respectively.
In Cytoscape, a network may contain only directed edges even if they are undirected in their biological context. Moreover, one network may contain both directed and undirected edges if the network is compiled by combining data from different sources.
In the situations described above, NetworkAnalyzer needs user input how to interpret the edges. Figure 2 depicts two examples of small networks in Cytoscape and their interpretations.
In (a), the network contains solely directed edges. Here, NetworkAnalyzer provides three possible interpretations of the edge directions in the network. The user has to select one of the interpretations for further processing of the network.
In (b), the network contains both undirected and directed edges. Note that undirected edges cannot be converted unambiguously to directed ones. Therefore, networks with mixed edges are handled as undirected ones.
Simple Network Parameters
Number of connected components
In undirected networks, two nodes are connected if there is a path of edges between them. Within a network, all nodes that are pairwise connected form a connected component. The number of connected components indicates the connectivity of a network – a lower number of connected components suggests a stronger connectivity.
Parameters related to shortest paths
The length of a path is the number of edges forming it. There may be multiple paths connecting two given nodes. The shortest path length, also called distance, between two nodes n and m is denoted by L(n,m). The network diameter is the largest distance between two nodes. If a network is disconnected, its diameter is the maximum of all diameters of its connected components. The diameter can also be described as the maximum node eccentricity (eccentricity is defined in the Attributes section). The network radius, on the other hand, is the minimum among the nonzero eccentricites of the nodes in the network. The average shortest path length, also known as the characteristic path length, gives the expected distance between two connected nodes.
Parameters related to neighborhood
The neighborhood of a given node n is the set of its neighbors. The connectivity of n, denoted by k_{n}, is the size of its neighborhood. The average number of neighbors indicates the average connectivity of a node in the network. A normalized version of this parameter is the network density. The density is a value between 0 and 1. It shows how densely the network is populated with edges (selfloops and duplicated edges are ignored). A network which contains no edges and solely isolated nodes has a density of 0. In contrast, the density of a clique is 1.
The number of isolated nodes may provide insight how the network density is distributed. Another related parameter is the network centralization [6]. Networks whose topologies resemble a star have a centralization close to 1, whereas decentralized networks are characterized by having a centralization close to 0. The network heterogeneity [6] reflects the tendency of a network to contain hub nodes.
In addition, the number of multiedge node pairs indicates how often neighboring nodes are linked by more than one edge.
Clustering coefficient
In undirected networks, the clustering coefficient C_{n} of a node n is defined as C_{n} = 2e_{n}/(k_{n}(k_{n}1)), where k_{n} is the number of neighbors of n and e_{n} is the number of connected pairs between all neighbors of n [17, 2]. In directed networks, the definition is slightly different: C_{n} = e_{n}/(k_{n}(k_{n}1)).
In both cases, the clustering coefficient is a ratio N / M, where N is the number of edges between the neighbors of n, and M is the maximum number of edges that could possibly exist between the neighbors of n. The clustering coefficient of a node is always a number between 0 and 1.
The network clustering coefficient is the average of the clustering coefficients for all nodes in the network. Here, nodes with less than two neighbors are assumed to have a clustering coefficient of 0.
Complex Network Parameters
Degree distributions
In undirected networks, the node degree of a node n is the number of edges linked to n. A selfloop of a node is counted like two edges for the node degree [5]. The node degree distribution gives the number of nodes with degree k for k = 0,1,….
In directed networks, the indegree of a node n is the number of incoming edges and the outdegree is the number of outgoing edges. Similar to undirected networks, there are an indegree distribution and an outdegree distribution.
Barabási and Albert used the node degree distribution to distinguish between random (as defined by Erdős and Rényi [7, 3]) and scalefree network topologies [2].
Neighborhood connectivity
The connectivity of a node is the number of its neighbors. The neighborhood connectivity of a node n is defined as the average connectivity of all neighbors of n [10]. The neighborhood connectivity distribution gives the average of the neighborhood connectivities of all nodes n with k neighbors for k = 0,1,…. Figure 3 shows the neighborhood connectivity distribution for the network presented in Figure 1b.
NetworkAnalyzer computes similar parameters for directed networks. In analogy to the in and outdegree, every node n in a directed network has in and outconnectivity. Thus, in directed networks, a node has the following types of neighborhood connectivity:
 only in  the average outconnectivity of all inneighbors of n;
 only out  the average inconnectivity of all outneighbors of n;
 in and out  the average connectivity of all neighbors of n (edge direction is ignored).
Based on the three definitions given above, there are three neighborhood connectivity distributions  "only in", "only out" and "in and out".
If the neighborhood connectivity distribution is a decreasing function in k, edges between low connected and highly connected nodes prevail in the network [10].
Shortest paths
The length of the shortest path between two nodes n and m is L(n,m). The shortest path length distribution gives the number of node pairs (n,m) with L(n,m) = k for k = 1,2,….
The network diameter is the maximum length of shortest paths between two nodes. If a network is disconnected, its diameter is the maximum of all diameters of its connected components.
The network diameter and the shortest path length distribution may indicate smallworld properties of the analyzed network [17].
Clustering coefficients
In undirected networks, the clustering coefficient C_{n} of a node n is defined as C_{n} = 2e_{n}/(k_{n}(k_{n}1)), where k_{n} is the number of neighbors of n and e_{n} is the number of connected pairs between all neighbors of n [17, 2]. In directed networks, the definition is slightly different: C_{n} = e_{n}/(k_{n}(k_{n}1)).
In both cases, the clustering coefficient is a ratio N / M, where N is the number of edges between the neighbors of n, and M is the maximum number of edges that could possibly exist between the neighbors of n. The clustering coefficient of a node is always a number between 0 and 1.
The average clustering coefficient distribution gives the average of the clustering coefficients for all nodes n with k neighbors for k = 2,…. NetworkAnalyzer also computes the network clustering coefficient that is the average of the clustering coefficients for all nodes in the network.
The clustering coefficient of a node is the number of triangles (3loops) that pass through this node, relative to the maximum number of 3loops that could pass through the node.
For example, in Figure 4, there is one triangle that passes through node b (the triangle bcd). The maximum number of triangles that could pass through b is three (in this case, the pairs (a, c) and (a, d) would be connected additionally). This yields a clustering coefficient of C_{b} = 1 / 3.
Ravasz et al. used the average clustering coefficient distribution to identify a modular organization of metabolic networks [13].
Shared neighbors
P(n,m) is the number of partners shared between the nodes n and m, that is, nodes that are neighbors of both n and m. The shared neighbors distribution gives the number of node pairs (n,m) with P(n,m) = k for k = 1,….
If a motif like the one presented in Figure 5 is overrepresented in a network, this can be inferred from the shared neighbors distribution.
Topological coefficients
The topological coefficient [15] T_{n}
of a node n with k_{n} neighbors is computed as follows:
T_{n} = avg ( J(n,m) ) / k_{n}.
Here, J(n,m) is defined for all nodes m that share at least one
neighbor with n. The value J(n,m) is the number of neighbors shared
between the nodes n and m, plus one if there is a direct link between n and
m.
For example, in Figure 6, J(b,c) = J(b,d) = J(b,e) = 2. Therefore, T_{b} = 2 / 3.
The topological coefficient is a relative measure for the extent to which a node shares neighbors with other nodes. NetworkAnalyzer computes the topological coefficients for all nodes with more than one neighbor in the network. Nodes that have one or no neighbors are assigned a topological coefficient of 0 (zero).
The chart of the topological coefficients can be used to estimate the tendency of the nodes in the network to have shared neighbors.
Stress centrality
The stress centrality [4, 14] of a node n is the number of shortest paths passing through n. A node has a high stress if it is traversed by a high number of shortest paths. This parameter is defined only for networks without multiple edges.
The stress centraility distribution gives the number of nodes with stress s for
different values of s. The values for the stress are grouped into bins whose size grows
exponentially by a factor of 10. The bins used for this distribution are {0}; [1, 10);
[10, 100);
...
Betweenness centrality
The betweenness centrality [4] C_{b}(n)
of a node n is computed as follows:
C_{b}(n) = ∑_{s≠n≠t} (σ_{st}
(n) / σ_{st}),
where s and t are nodes in the network different from n,
σ_{st} denotes the number of shortest paths from s to t, and
σ_{st} (n) is the number of shortest paths from s to t
that n lies on.
Betweenness centrality is computed only for networks that do not contain multiple edges. The betweenness value for each node n is normalized by dividing by the number of node pairs excluding n: (N1)(N2)/2, where N is the total number of nodes in the connected component that n belongs to. Thus, the betweenness centrality of each node is a number between 0 and 1.
For example, the betweenness centrality of node b in Figure 7
is computed as follows:
C_{b}(b) =
((σ_{ac}(b) / σ_{ac}) +
(σ_{ad}(b) / σ_{ad}) +
(σ_{ae}(b) / σ_{ae}) +
(σ_{cd}(b) / σ_{cd}) +
(σ_{ce}(b) / σ_{ce}) +
(σ_{de}(b) / σ_{de})) / 6 =
((1 / 1) + (1 / 1) + (2 / 2) + (1 / 2) + 0 + 0) / 6 = 3.5 / 6 ≈ 0.583
The betweenness centrality of a node reflects the amount of control that this node exerts over the interactions of other nodes in the network [19]. This measure favors nodes that join communities (dense subnetworks), rather than nodes that lie inside a community.
NetworkAnlayzer uses the fast algorithm by Brandes [4] for the computation of node betweenness centrality. This algorithm has a complexity of O(NM), N being the number of nodes and M  the number of edges in the network.
Closeness centrality
The closeness centrality [11] C_{c}(n)
of a node n is defined as the reciprocal of the average shortest path length and is computed
as follows:
C_{c}(n) = 1 / avg( L(n,m) ),
where L(n,m) is the length of the shortest path between two nodes
n and m. The closeness centrality of each node is a number between 0 and 1.
NetworkAnalyzer computes the closeness centrality of all nodes and plots it against the number of neighbors. The closeness centrality of isolated nodes is equal to 0.
Closeness centrality is a measure of how fast information spreads from a given node to other reachable nodes in the network [11].
For example, the closeness centrality of node b in Figure 7
is computed as follows:
C_{c}(b) = 1/ ( (L(b, a) + L(b, c)
+ L(b, d) + L(b, e)) / 4) = 4/ (1 + 1 + 1 + 2) = 4/5 = 0.8
Analysis of subset of nodes
An exhaustive topological analysis of very large networks can be a time consuming task. The computation of local parameters for the nodes is significantly faster than the computation of global (pathrelated) parameters. Examples of local parameters are node degree, neighborhood connectivity, topological and clustering coefficients. Betweenness and closeness centralities, as well as stress, are global parameters.
NetworkAnalyzer provides the option "Analyze Subset of Nodes" for computing local parameters for a subset of nodes. If one or more nodes in the network are selected before starting an analysis, only the subnetwork induced by the selected nodes is analyzed. Moreover, only local parameters are computed. Shared neighbors distribution and shortest path lengths distribution, among others, are not displayed in the results.
Batch Analysis
The "Batch Analysis" menu item in NetworkAnalyzer is used to perform topological analysis on all networks stored in specific directory, using all possible interpretations for every network. Batch analysis consist of three simple steps:
 Selecting directories
The user selects the input and output directories. The input directory should contain network files that can be loaded into Cytoscape. NetworkAnalyzer loads consecutively each of the networks it finds in the directory and performs topological analysis on it. Subdirectories of the input directory are not traversed. The output directory is the one that will contain all analysis results after the batch analysis. In order to avoid file overwriting, NetworkAnalyzer requires that the output directory is empty (contains no files) before the batch analysis starts.
 Analysis
NetworkAnalyzer scans the input directory and loads all supported networks into Cytoscape, one at a time. Each loaded network is inspected and then it is analyzed considering all possible interpretations for it. The analysis step is complete after all networks are analyzed. Note that, depending on the number of networks and their sizes, this might be a very timeconsuming step.
 Inspection of results
After the analysis is complete, the button "Show Results" is enabled, and it displays the results dialog. The dialog contains a table of all topological analyses performed. Every row in the results table lists the loaded network, its interpretation and the resulting network statistics file that was saved in the output directory. By clicking on a network name and on statistics file name, the user can load the network and topology analysis results, respectively.
Fitting a Line
NetworkAnalyzer provides another useful feature  fitting a line on the data points of some complex parameters. The method applied is the least squares method for linear regression [18]. NetworkAnalyzer gives the correlation between the given data points and the corresponding points on the fitted line. In addition, the Rsquared value (also known as coefficient of determination) is reported.
Fitting a line can be used to identify linear dependencies between the values of the x and y coordinates in a complex parameter. Figure 8 shows the fitted line on a neighborhood connectivity distribution. The correlation between the data points and corresponding points on the line is approximately 0.969. The Rsquared value is 0.939, giving a relatively high confidence that the underlying model is indeed linear.
Fitting a Power Law
The degree distribution of many biological networks approximates a power law: DD(k) ~ k^{α} for some negative constant α. Several studies have reported similar properties of the average clustering coefficient distribution [13] and the topological coefficients [15].
NetworkAnalyzer can fit a power law to some topological parameters. Please note that NetworkAnalyzer uses the least squares method [18], and only points with positive coordinate values are considered for the fit. This approach fits a line on logarithmized data and may be inappropriate for supporting certain hypotheses [8, 16].
NetworkAnalyzer gives the correlation between the given data points and
the corresponding points on the fitted curve. In addition, the Rsquared value
(also known as coefficient of determination) is reported. This coefficient gives the
proportion of variability in a data set, which is explained by a fitted linear model.
Therefore, the Rsquared value is computed on logarithmized data, where the powerlaw
curve:
y = β x^{α}
is transformed into linear model:
ln y = ln β + α ln x,
as shown in Figure 9.
Attributes
Node Attributes
While iterating over the connected components of a network, NetworkAnalyzer computes the following topological measures for each node n:

AverageShortestPathLength
Average length of a shortest path between n and any other node. If n is an isolated node, the value of this attribute is zero. 
BetweennessCentrality
Betweenness centrality of n as explained in the section Betweenness centrality. 
ClosenessCentrality
Closeness centrality of n as described in the section Closeness centrality. 
ClusteringCoefficient
This numerical attribute stores the clustering coefficient of n, as defined in [2]. Nodes with less than 2 neighbors have a clustering coefficient of zero. 
Degree
The degree of n as explained in the section Degree Distributions. 
Eccentricity
The maximum noninfinite length of a shortest path between n and another node in the network. If n is an isolated node, the value of this attribute is zero. 
IsSingleNode
This boolean attribute indicates if n is an isolated node, that is, if n has no neighbors. 
NeighborhoodConnectivity
The neighborhood connectivity of n as explained in the section Neighborhood Connectivity. 
NumberOfDirectedEdges
This attribute counts the number of directed edges that are connected to n. 
NumberOfUndirectedEdges
This attribute counts the number of undirected edges that are connected to n. 
PartnerOfMultiEdgedNodePairs
This attribute indicates if n is a partner of node pairs with multiple edges. 
Radiality
This attribute is a node centrality index [4] computed by subtracting the average shortest path length of a node n from the diameter of the connected component plus 1. The radiality of each node is divided by the diameter of the connected component. Thus it is a number between 0 and 1. 
SelfLoops
This attribute counts the number of selfloops at n. 
Stress
This attribute counts the number of shortest paths passing through a node. 
TopologicalCoefficient
This numerical attribute stores the topological coefficient of n, as defined in [15]. Nodes with less than 2 neighbors have a topological coefficient of zero.
Edge Attributes
While iterating over the connected components of a network, NetworkAnalyzer computes the following topological measures for each edge e:

EdgeBetweenness
This attribute stores the edge betweenness of each edge normalized by dividing by (M1)(M2), where M is the number of edges in the connected component that the edge belongs to. The edge betweenness of e=(v,w) is defined as the number of shortest paths between two nodes s and t that go through e divided by the total number of shortest paths that go from s to t [12], [19].
Edge betweenness is computed only for networks that do not contain multiple edges. In addition, selfloops are neglected in the calculation.
Visualizing Computed Parameters
For a visual inspection of the results of a topological analysis, NetworkAnalyzer includes two options. On the one hand, the computed parameters can be visualized by mapping them to node/edge size and color. On the other hand, each pair of computed parameters can be plotted in a chart. Both types of visualizations can be applied to the attributes computed by NetworkAnalyzer, as well as to all other numerical attributes assigned to the nodes and edges of the network of interest.
Note that visual mapping is possible only when the computed parameters for the selected
network are stored as node and edge attributes. Make sure the option "Store node / edge
parameters in node / edge attributes" in NetwowkAnalyzer Settings is enabled. Parameters
loaded from a .netstats
file cannot be visualized because the network itself is not
stored in the network statistics file. If, after performing topological analysis, the network is
modified by introducing or removing nodes or edges, it is recommended (and sometimes required)
to run NetworkAnalyzer again before visualizing any parameters.
Map Parameters to Visual Style
The visual mapping is initiated by the Map Parameters to Visual Style dialog, shown on Figure 10. There are two ways of mapping computed parameters.

Map to node / edge size
The computed parameter is mapped to the size of the nodes or edges. Mapping can be straight or inverse, that is, low parameter values can be mapped to small sizes or to large sizes. The smallest node size is set to 10 and the largest one to 100. Regarding edges, size reflects the edge line width and varies between 1 and 8. Refer to the documentation of Cytoscape's VizMapper^{TM} if interested in finetuning the mapping parameters. 
Map to node / edge color
A computed parameter is mapped to the color of the nodes or edges. Two mapping styles are possible  mapping low parameter values to bright colors or to dark colors. By default, the brightest color is green and the darkest color is red. The mapping also uses a middle (intermediate) color, which allows for finegrained perception of differing values through the color gradient. The default middle color is yellow.
The default values for background, bright, middle and dark colors can be changed in the NetworkAnalyzer Settings menu option.
Plot Parameters
The Plot Parameters dialog offers a possibility to plot two parameters against each other. One possible application of such a plot is to observe the correlation of two parameters as shown on Figure 11. The parameters to be plotted can be chosen from two dropdown menus. The Attrbiute 1 menu provides the values for the domain/category axis, and the Attribute 2 menu specifies the values for the range/value axis. The plot is updated each time a different parameter is selected in one of the menus.
Network Modifications
In addition to topological analysis, NetworkAnalyzer offers a useful set of network modifications.
Remove Duplicated Edges
NetworkAnalyzer can remove duplicated edges in a network. After applying this operation to a network, every node pair (a, b) is connected by at most three edges  a directed edge from a to b, a directed edge from b to a, and an undirected edge. For each of these edge types, if there are more than one edge, one is randomly chosen and all others are removed from the network. Any attributes of the considered edges, if present, are ignored. If the option "Ignore edge direction" is selected, all edges are treated as undirected and thus only one edge per node pair is retained.
This option does not affect selfloops in the network. All duplicated selfloops, if any, are retained.
Please note that this option effectively modifies the selected network(s) and the operations performed cannot be undone.
Remove SelfLoops
NetworkAnalyzer can remove selfloops in a network. A selfloop is an edge which connects a node to itself. By applying this operation to a network, all directed and undirected selfloops are removed.
This option does not affect any other edges present in the network.
Please note that this option effectively modifies the selected network(s) and the operations performed cannot be undone.
Network Operations
NetworkAnalyzer can compute the union, intersection and difference of any two networks. The result of such operation is stored as a new network in Cytoscape.
NetworkAnalyzer computes these set operations two networks both on the node and on the edge levels. Note that every network is a pair of a node set and an edge set (N, E). Set operations  union, intersection and difference  are performed on both the node sets and the edge sets of networks.
Connected Components
NetworkAnalyzer can list all its connected components of a disconnected network. The size (number of nodes) of each component is reported. By selecting a specific connected component, it can be exported as a separate network in Cytoscape. This feature allows users to perform topological analysis on the largest connected component of a network.
References
[1]  Barabási, A.L., Albert, R.: Emergence of scaling in random networks. Science 286 (1999) 509512 
[2]  Barabási, A.L., Oltvai, Z.N.: Network biology: understanding the cell's functional organization. Nat Rev Genet 5 (2004) 101113 
[3]  Bollobas, B.: Random graphs. Cambridge University Press, Cambridge (2001), ISBN: 0521809207 
[4]  Brandes, U.: A faster algorithm for betweenness centrality. J Math Sociol 25 (2001) 163177 
[5]  Diestel, R.: Graph theory. SpringerVerlag, Heidelberg (2005), ISBN 3540261826 
[6]  Dong, J., Horvath, S.: Understanding network concepts in modules. BMC Syst Biol 24 (2007) 
[7]  Erdős, P., Renyi, A.: Publ Math Inst Hung Acad Sci 5 (1960) 17 
[8]  Goldstein, M.L., Morris, S.A., Yena, G.G.: Problems with fitting to the powerlaw distribution. European Physical Journal B 41 (2004) 255258 
[9]  Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning. SpringerVerlag (2001), ISBN 0387952845 
[10]  Maslov, S., Sneppen, K: Specificity and stability in topology of protein networks. Science 296 (2002) 910913. 
[11]  Newman, M.E.J.: A measure of betweenness centrality based on random walks. arXiv (2003) condmat/0309045 
[12]  Newman, M.E.J., Girvan, M.: Finding and evaluating community structure in networks. Soc Networks 27 (2005) 3954 
[13]  Ravasz, E., et al.: Hierarchical organization of modularity in metabolic networks. Science 297 (2002) 15511555 
[14]  Shimbel, A.: Structural parameters of communication networks. Bull Math Biophys 15 (1953) 501507. 
[15]  Stelzl, U., et al.: A human proteinprotein interaction network: a resource for annotating the proteome. Cell 122 (2005) 957968. 
[16]  Tanaka, R., Yi, T.M, Doyle, J.: Some protein interaction data do not exhibit power law statistics. FEBS Letters 579 (2005) 51405144 
[17]  Watts D.J., Strogatz, S.H.: Collective dynamics of 'smallworld' networks. Nature 393 (1998) 440442 
[18]  Weisstein, E. W.: Least Squares Fitting  Power Law. MathWorld  A Wolfram Web Resource. (http://mathworld.wolfram.com/LeastSquaresFittingPowerLaw.html) 
[19]  Yoon, J., Blumer, A., Lee, K.: An algorithm for modularity analysis of directed and
weighted biological networks based on edgebetweenness centrality. Bioinformatics, 22 (2006) 31068 