example with data from csv file or pandas? #180
Replies: 1 comment
Hi Guido,

You don't have to use datasets at all. All layers in Spektral take simple TensorFlow tensors as input, so as long as you know how to create the proper input you can completely ignore the dataset/loader interface. If you don't want to deal with tensors manually, datasets are there to interface whatever data you have with the rest of Spektral easily; they are not meant to be saved to disk.

In any case, your CSV should be structured as follows. Let's say that you have a graph with N nodes. Depending on the graph, you can have
Most graphs will have at least the first two.

Adjacency matrix CSV

In this case, to create an adjacency matrix from this file you do

    import pandas as pd

    # read_csv treats the first line as a header by default
    adjacency = pd.read_csv("path/to/adjacency.csv").values  # You might want to convert to a sparse matrix afterwards

Note that this is very expensive in terms of memory, and that a much better way is to store only the sparse indices where your adjacency matrix is non-zero, i.e., the pairs i, j such that entry (i, j) is non-zero.
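If you already have a dense adjacency CSV, one possible way to get to the sparse-index layout is sketched below. This is only an illustration: the 3-node matrix, the in-memory CSV text, and the column names `row` and `col` are all made up for the example, and it assumes the file has a header row.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical 3-node dense adjacency matrix, stored as CSV text with a header row.
dense_csv = io.StringIO("n0,n1,n2\n0,1,0\n1,0,1\n0,1,0\n")
dense = pd.read_csv(dense_csv).values

# Row and column indices of every non-zero entry.
row, col = np.nonzero(dense)

# One (row, col) pair per line; the column names are illustrative only.
edges = pd.DataFrame({"row": row, "col": col})

# With no path argument, to_csv returns the CSV content as a string.
sparse_csv = edges.to_csv(index=False)
print(sparse_csv)
```

For a graph with few edges, the resulting file holds one line per edge rather than N values per node.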
And to create a sparse adjacency matrix you can do:

    import numpy as np
    import pandas as pd
    import scipy.sparse

    indices = pd.read_csv("path/to/sparse_adjacency.csv").values
    row, col = indices.T  # one (row, col) pair per CSV line
    data = np.ones_like(row)  # every stored edge gets weight 1
    adjacency = scipy.sparse.csr_matrix((data, (row, col)))

Now you have the adjacency matrix of the graph. If you want to use it in a model, you might also want to convert it to a sparse Tensor (not needed if you use a Dataset + Loader):

    from spektral.layers import ops

    adjacency = ops.sp_matrix_to_sp_tensor(adjacency)

Node attributes CSV
Again, you can read it as:

    node_attributes = pd.read_csv("path/to/node_attributes.csv").values

And that's it, you have the node attributes of the graph.

Targets

Note that in most cases you will also have the targets for training your model. Let's just create dummy targets for the sake of example:

    y = np.random.rand(node_attributes.shape[0], 1)

Using the matrices

Now that you have the adjacency matrix, the node attributes, and the targets, let's say that you want to do node regression (i.e., for each node you want to predict a continuous variable) and you have a Keras model based on Spektral:

    from spektral.models import GCN

    model = GCN(1, output_activation=None)  # no activation to do regression

If you want to write your own training loop, you only need to know how to call the model:

    output = model([node_attributes, adjacency])

and from here you can write your own training script (see here). Otherwise, you can use the dataset and loader interface:

    from spektral.data import Graph, Dataset, SingleLoader

Create a custom dataset with your inputs and targets:

    class MyDataset(Dataset):
        def read(self):
            # a single-graph dataset: node attributes, adjacency, and targets
            return [Graph(x=node_attributes, a=adjacency, y=y)]

    dataset = MyDataset()

Train the model using a SingleLoader:

    loader = SingleLoader(dataset)
    model.compile("adam", "mse")
    model.fit(loader.load(), steps_per_epoch=loader.steps_per_epoch)

And that's all.

Cheers
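One thing that may be worth checking before training is that the adjacency matrix and the node attributes agree on the number of nodes. The sketch below is only an illustration on made-up in-memory CSV data (the column names and values are hypothetical). It assumes an undirected graph whose edge list contains both directions of every edge, and it shows that when `shape` is not given, SciPy infers it from the largest index present, which can drop trailing nodes that have no edges.

```python
import io

import numpy as np
import pandas as pd
import scipy.sparse

# Hypothetical inputs: an edge list and a 4-node attribute table, both with header rows.
edges_csv = io.StringIO("row,col\n0,1\n1,0\n1,2\n2,1\n")
attrs_csv = io.StringIO("f0,f1\n0.1,1.0\n0.2,2.0\n0.3,3.0\n0.4,4.0\n")

node_attributes = pd.read_csv(attrs_csv).values
n_nodes = node_attributes.shape[0]

row, col = pd.read_csv(edges_csv).values.T
data = np.ones_like(row, dtype=np.float32)

# Shape inferred from the largest index: node 3 has no edges, so it is lost.
inferred = scipy.sparse.csr_matrix((data, (row, col)))

# Passing the shape explicitly keeps one row and column per node.
adjacency = scipy.sparse.csr_matrix((data, (row, col)), shape=(n_nodes, n_nodes))

# For an undirected graph, the matrix should equal its transpose.
is_symmetric = (adjacency != adjacency.T).nnz == 0

print(inferred.shape, adjacency.shape, is_symmetric)
```

If the two shapes disagree on your own data, passing `shape=(N, N)` as above is one way to keep the adjacency matrix aligned with the node attributes.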
-
I wonder if you can provide an example with data input in a csv file? I understand the ease of providing tutorials using your custom dataset but it wasn't very easy for me to understand how to use my own data. If I need to create a Spektral Dataset and save it to disk this probably is not a good alternative for me.
Would be great if you can also provide an example on how to do regression instead of classification.
thanks
Guido