Review Comment:
I would like to thank the authors for their detailed responses and revisions. Some of my comments are completely resolved. However, I found that several of my comments were only partially addressed and that the paper would benefit from further work on a few of the issues I highlighted. I am particularly concerned about the clarity of the analysis of the embedding results (see my follow-up in point 3.4 for further detail). Specifically, I would like to see further edits in relation to the following comments raised in my first review:
1. In Section 2:
1.1 Thank you for adding the suggested material. The sentence "These are much more commonly applied to the [...]" does not make sense in your added text, since the methods you added have not been introduced at that point. I suggest writing instead: "Spectral embedding techniques, such as the spectral clustering algorithm, are much more commonly applied to the [...]"
1.2 Thanks for your clarification. I believe you should add it to the paper. As it stands, the formulation "Nevertheless, unsupervised graph embedding is difficult to expand and [...]" remains imprecise.
2. In Section 3:
2.5 You might have misunderstood my comment here. What I was asking is whether you could add a mathematical definition of the operation you refer to as "splicing" to the paper. I am unsure what exactly you mean by this term; are you referring to the concatenation operation?
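If "splicing" does indeed denote concatenation, a short sketch of the operation I have in mind is the following (the embedding shapes are purely illustrative assumptions, not taken from the paper):

```python
import numpy as np

# Hypothetical node embeddings from two branches of a model
# (shapes are illustrative assumptions).
h1 = np.ones((5, 16))   # 5 nodes, 16-dimensional embedding
h2 = np.zeros((5, 8))   # 5 nodes, 8-dimensional embedding

# "Splicing" as concatenation: join the two representations per node
# along the feature axis.
z = np.concatenate([h1, h2], axis=1)
print(z.shape)  # (5, 24)
```

A definition of this form (or of whatever operation you actually perform) would remove the ambiguity.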
Furthermore, I am happy with your explanation that your smoothness claims are based on experimental results. It would be great if you could add this to the text. As it stands the claim "the dual GCN representation aggregate the higher order network structure, hence the trained GCN model will have an embedding that is too smooth [...]" is unsubstantiated.
2.6 I agree with your response ("GCNs can be successful used for datasets containing structural information"), it would be great if you could also update the paper accordingly. In the paper you still write "GCN is not suitable for graphs with only structural information."
3. In Section 4:
3.2 Thanks for your response. I still think it is necessary that you provide the hyperparameters of your proposed models in your publication. The relevant information includes the hidden dimensions, number of layers, etc., as well as your chosen optimiser, hyperparameter optimisation scheme, number of training epochs, and the learning rate of your proposed model. Without these details it is difficult to clearly understand the behaviour of your models and impossible to reproduce your results.
3.4 I disagree with the statement that the embedding output of your model cannot be quantitatively analysed. You could consider using a clustering algorithm and reporting the Adjusted Mutual Information, the Adjusted Rand Index, or another cluster evaluation metric. You could also opt to report the counts of the different classes which fall within your chosen boxes of observation. Any of these quantitative methods could help provide evidence for the claims you make in the analysis of Figures 5, 6 and 7. As it stands, I find it difficult to agree with your analysis in Section 4.6 based on the figures alone. You should furthermore add to the text that you chose the highlighted boxes by observation, as you state in your response.
Also, your claim about the separation of classes 10 and 3 in Figure 7 is not verifiable, since the figures lack clarity. The marker size is too large, so it is unclear how many data points are hidden by overlapping markers; the legends are unordered; and in the case of Figure 7, several classes (such as classes 1 and 3) share the same marker colour.
I think it is necessary to both quantitatively assess the output of your embedding method and increase the clarity of your figures to substantiate the conclusions you draw in this paper.
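As a concrete sketch of the kind of quantitative check I am suggesting (using scikit-learn and synthetic data in place of your embeddings, to which I obviously do not have access):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_mutual_info_score, adjusted_rand_score

# Synthetic stand-in for a 2-D embedding with two well-separated classes.
rng = np.random.default_rng(0)
emb = np.vstack([rng.normal(0.0, 0.1, size=(20, 2)),
                 rng.normal(3.0, 0.1, size=(20, 2))])
labels = np.array([0] * 20 + [1] * 20)

# Cluster the embedding, then compare the clusters to the true labels.
pred = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(emb)
ari = adjusted_rand_score(labels, pred)
ami = adjusted_mutual_info_score(labels, pred)
print(f"ARI={ari:.2f}, AMI={ami:.2f}")  # both near 1.0 for clean separation
```

Reporting such scores for your actual embeddings and class labels would substantiate the visual claims far better than the figures alone.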
4. Minor Issues:
4.2 The references in Section 2 still do not seem to correspond to the numbering of the reference list. E.g., you cite the PCA paper in the following sentence: "In unsupervised learning (e.g. such as DeepWalk, node2vec, and struc2vec)[10]", and the MDS reference is given here: "comprehensive features are extracted, which are closely related to the prediction network accuracy[11]." The references given in Section 2 appear incorrect throughout, so it would be great if you could carefully review the correspondence between the numbering in the text and the reference list.
4.6 The graph definition in Equation (7) is still inconsistent with the previous definitions, which include only the node set and the adjacency matrix. It would be good if you could reformulate Equation (7) to highlight the fact that the graph structure (as previously defined in your text) and the additional features and labels are passed to the GCN as separate mathematical objects.
4.11 In the third line after Equation (5) there is still a superscript written as a subscript. And, as mentioned in my first review, I think you should refer to the operator as a normalised adjacency matrix rather than a Laplacian matrix.
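For reference, here is a minimal sketch of the symmetrically normalised adjacency operator I have in mind, on a toy 3-node graph (an illustrative assumption; I include self-loops, as is common in GCNs):

```python
import numpy as np

# Toy adjacency matrix of an undirected 3-node path graph (illustrative).
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])

A_hat = A + np.eye(3)                    # adjacency with self-loops
d_hat = A_hat.sum(axis=1)                # degrees of A_hat
D_inv_sqrt = np.diag(1.0 / np.sqrt(d_hat))

# The normalised *adjacency* operator used in GCN layers:
A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt

# By contrast, the normalised *Laplacian* would be I minus this
# normalised adjacency; the two are different objects, hence my
# comment on the naming.
```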
4.13 The updated introduction of Equation (5) is not correct: Equation (5) does not contain the definition of the mathematical convolution operation, but rather the equation of a graph convolutional layer. This should be corrected.
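For clarity, what Equation (5) appears to define is the standard propagation rule of a single graph convolutional layer, commonly written as (in my notation, which may differ from yours):

```latex
H^{(l+1)} = \sigma\left( \hat{D}^{-\frac{1}{2}} \hat{A} \hat{D}^{-\frac{1}{2}} H^{(l)} W^{(l)} \right),
\qquad \hat{A} = A + I,
```

where $H^{(l)}$ is the layer input, $W^{(l)}$ the trainable weight matrix, and $\sigma$ a nonlinearity. Introducing the equation as "a graph convolutional layer" rather than "the convolution operation" would resolve the issue.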
4.16 Thanks for updating Table 2. In Table 2 you state that the number of random walk steps for the Traffic datasets is 80, but the accompanying text says "[...] the number of random walk steps was set to 15 [...]." I am not sure whether these two statements are in direct contradiction, so it would be great if you could clarify which of these parameters was actually used.
