Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suspected issue in local_attachment_tensor.py #94

Open
vmgaribay opened this issue Jul 17, 2024 · 6 comments
Open

Suspected issue in local_attachment_tensor.py #94

vmgaribay opened this issue Jul 17, 2024 · 6 comments
Assignees

Comments

@vmgaribay
Copy link

When I run the full step, the number of links created was not the same as the number successfully deleted. I had a similar issue in local_attachment_basic_homophily.py and it was due to me only checking for duplicate edge requests in one direction. I solved this by sorting the pairs and then taking the unique pairs. The two files do not share the same exact flow, but maybe it is a similar problem with either duplicate or self-edges.
We should probably check plain local_attachment, as well, if it is our intention to release it.

@SarahAlidoost
Copy link
Member

related issue #73

@SarahAlidoost SarahAlidoost self-assigned this Aug 15, 2024
@SarahAlidoost
Copy link
Member

@vmgaribay I noticed your fixes to local_attachment_basic_homophily.py, while the method is "size" in link_deletion, see here. I'm trying to understand the issue. Could you please help clarify the following points?

  1. When you say “the number of links created was not the same as the number successfully deleted”, how do you determine “the number of links created” and “the number successfully deleted”? I assume that you inspected model_graph.number_of_edges() after each of the following steps: initialize_model(), local_attachment_basic_homophily(), link_deletion(), and finally model.run(). Is that correct? Could you clarify which of these represents 'the number of links created' and 'the number successfully deleted'?

  2. In this commit, when you fixed local_attachment_basic_homophily.py, the method = "size" was used in link_deletion. Did you check if this issue—where 'the number of links created was not the same as the number successfully deleted'—occurs with other del_method as well?

  3. In the development branch, method = "size" has been replaced with method = params['del_method']. When running the model with local_attachment_basic_homophily.py and the default 'del_method' (which is 'probability'), the model_graph.number_of_edges() ends up being 0. Is this expected behavior? Here code to test this:

        model = dgl_ptm.PovertyTrapModel(model_identifier='my_model')
        model.set_model_parameters(steering_parameters={'step_type':'custom'})
        model.initialize_model()
        print(f"Number of edges start: {model.model_graph.number_of_edges()}")
        model.run()
        print(f"Number of edges after run: {model.model_graph.number_of_edges()}")
  1. The local_attachment_tensor isn't used in the code. Shall we check this issue for both local_attachment_tensor and local_attachment?

@vmgaribay
Copy link
Author

  1. Yes, in the custom step in step.py, the number of edged is queried before and after link creation (2 processes local and global) to determine the threshold supplied to link_deletion(). Despite this, the number of edges increases slightly.
  2. I did not check other del_methods.
  3. If the probability is set too high, then it is possible that it would lead to a fully unconnected graph. I would say that the behavior is not necessarily unexpected, but it is undesired in the default case.
  4. I think it was just a complication with local_attachment_tensor as a feature of the functions used, but at this point in time I cannot remember.

I hope these were appropriate answers, but if anything needs clarification, please reach out.

@meiertgrootes
Copy link
Member

check the implementation of local_attachment_tensor for duplication of edges which give rise to the link deletion having removed an edge and then subsequently retryiing and failiing for the already removed edge.

@SarahAlidoost
Copy link
Member

SarahAlidoost commented Aug 22, 2024

@vmgaribay and @meiertgrootes I could reproduce the original problem: when step_type=custom and del_method=size, the agent_graph.number_of_edges() after the local_attachment_homophily does not match the value after the link_deletion in step function. The is because local_attachment_homophily produces repeated nodes in the node_pairs calculation.

Reproducing this problem is not easy due to the random behavior of dgl.sampling.sample_neighbors used in local_attachment_homophily. Note that, the problem only occurs when del_method=size and del_threshold is calculated as threshold = int((agent_graph.number_of_edges() - start_edges) / 2). See the documentation of link_deletion.

After applying the fixes by @vmgaribay here, the original problem was resolved. However, while fixing the problem, a new bug is introduced by hardcoding the threshold, see here. As a result, users can no longer modify the threshold or switch to a different del_method, such as probability. I've reported this in #112.

Regarding local_attachment or local_attachment_tensor when del_method=size, I found that in these cases, the number of added edges equals the number of deleted edges, meaning there are no repeated nodes in node_pairs. The original problem does not occur when threshold = int((agent_graph.number_of_edges() - start_edges) / 2) because n_FoF_links = 1 and this is hardcoded in the step function, see here.

@meiertgrootes
Copy link
Member

hi @SarahAlidoost @vmgaribay,
based on your results I looked at local_attachment_tensors algorithm again.
The algorithm starts by selecting elements from the adjacency matrix, i.e directed edges, at random. While it is robust against the selection of the same edge in both directions in the case of two edges with a one common central node, and the edges both being directed towards that node (e.g. matrix element (0,3) and (10,3) being selected), depending on the neighbour constellations of nodes 0 and 10 possible FoF suggestions are 0->10 AND 10->0. if both are then subsequently selected edges can be created in duplicate.
However, as we only use add_edges for both directions and no calls to to_simple I would have expected the duplicates to persist, and in turn, this not giving rise to the issue @vmgaribay raised.

I believe a fix of this bug is simple, but would still be very interested if you managed to reproduce it @SarahAlidoost

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants