ML setup #6
Machine Learning setup and training issues
In this issue I will document the current training setup, the results, and the problems I encounter while trying to converge to a good solution.
Data
The data comes from a 24 h simulation (1 h output resolution) over the whole Arctic.
I am focusing on a small region (400 km radius) in the central Arctic. The goal is to predict the next position of each vertex.
For training I use all the vertices at times 1, 3, 5, 7, 9 and 11, and for validation I use the same area at times 12, 14 and 16.
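Purely for illustration, the split amounts to indexing snapshots by output hour. `snapshots` here is a hypothetical container, not the repo's actual data layout:

```python
import torch

# Hypothetical per-hour store: hour -> (num_vertices, num_features) tensor.
snapshots = {h: torch.randn(100, 4) for h in range(1, 25)}

train_hours = [1, 3, 5, 7, 9, 11]  # training snapshots
val_hours = [12, 14, 16]           # validation snapshots, same region
train_set = [snapshots[h] for h in train_hours]
val_set = [snapshots[h] for h in val_hours]
```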
Representation and input
To represent the mesh I use two graphs, one containing vertex information and the other element information.
The input is composed of a central vertex with all the neighbouring elements/vertices around it.
As input features for the vertex graph I use the position (x, y) and the forcing fields (wind, ocean). For the element graph I use the position (x, y), thickness and concentration. All inputs are standardized.
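Standardization is presumably the usual zero-mean/unit-variance scaling; a minimal sketch, assuming the statistics come from the training snapshots only (the function name and shapes are mine):

```python
import torch

def standardize(x: torch.Tensor, train_x: torch.Tensor) -> torch.Tensor:
    """Zero-mean / unit-variance scaling with training-set statistics."""
    mean = train_x.mean(dim=0)
    std = train_x.std(dim=0).clamp_min(1e-8)  # guard against constant features
    return (x - mean) / std
```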
Model
The model processes both graphs with GCNNs and then combines the two feature vectors to predict the velocity of the vertex (class `GCNN_2G(nn.Module)`).
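For context, a minimal sketch of what a two-branch GCNN of this shape could look like, written with PyTorch Geometric. Only the class name and the two-graph idea come from the issue; the layer widths, depth, feature counts and mean pooling are my assumptions:

```python
import torch
import torch.nn as nn
from torch_geometric.nn import GCNConv, global_mean_pool

class GCNN_2G(nn.Module):
    """Two GCN branches (vertex graph and element graph) whose pooled
    embeddings are concatenated and mapped to a 2-D vertex velocity."""

    def __init__(self, vertex_feats=4, element_feats=4, hidden=64):
        super().__init__()
        self.v_conv1 = GCNConv(vertex_feats, hidden)
        self.v_conv2 = GCNConv(hidden, hidden)
        self.e_conv1 = GCNConv(element_feats, hidden)
        self.e_conv2 = GCNConv(hidden, hidden)
        self.head = nn.Sequential(
            nn.Linear(2 * hidden, hidden), nn.ReLU(), nn.Linear(hidden, 2)
        )

    def forward(self, x_v, edge_v, batch_v, x_e, edge_e, batch_e):
        # Vertex branch: two rounds of message passing, then one vector per graph.
        h_v = torch.relu(self.v_conv1(x_v, edge_v))
        h_v = global_mean_pool(torch.relu(self.v_conv2(h_v, edge_v)), batch_v)
        # Element branch, same structure.
        h_e = torch.relu(self.e_conv1(x_e, edge_e))
        h_e = global_mean_pool(torch.relu(self.e_conv2(h_e, edge_e)), batch_e)
        # Combine both embeddings and predict the (u, v) velocity.
        return self.head(torch.cat([h_v, h_e], dim=1))
```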
Loss Function
A custom loss function accounts for velocity angle and position; it can be parametrized via the A, B, C coefficients.
```python
import torch.nn as nn

class CustomIceLoos(nn.Module):
    def __init__(self, A=1, B=0, C=0, step=1, d_time=3600):
        super(CustomIceLoos, self).__init__()
        self.mae = nn.L1Loss()
        self.A = A
        self.B = B
        self.C = C
        self.step = step
        self.d_time = d_time
```
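The snippet is truncated before the forward pass. As a hedged guess at how the three terms could combine (the actual formulation is not shown here; `pred_vel`, `true_vel`, `pos` and `next_pos` are hypothetical arguments):

```python
import torch.nn.functional as F

# Hypothetical forward pass: A weights an L1 velocity term, B an angular
# term, C a position term after advecting for `step * d_time` seconds.
def forward(self, pred_vel, true_vel, pos, next_pos):
    vel_term = self.mae(pred_vel, true_vel)
    # 1 - cosine similarity penalizes misaligned velocity directions.
    angle_term = (1 - F.cosine_similarity(pred_vel, true_vel, dim=-1)).mean()
    # Advect the vertex with the predicted velocity and compare positions.
    pos_term = self.mae(pos + pred_vel * self.step * self.d_time, next_pos)
    return self.A * vel_term + self.B * angle_term + self.C * pos_term
```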
Training and results
Training dynamics are not bad and there is no overfitting over time (with these data splits). After standardizing the features I had to clip the gradient norm to 1 to stabilize training; otherwise the gradients exploded. Typical training curves look like the one attached (note the log scale).
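The clipping itself is one line in the loop; a sketch of where it sits (the loader and batch layout are placeholders, not the actual training code):

```python
import torch

def train_epoch(model, loader, loss_fn, optimizer):
    for inputs, targets in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(inputs), targets)
        loss.backward()
        # Clip the global gradient norm to 1; without this the gradients
        # exploded after feature standardization.
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
        optimizer.step()
```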
Results are on the order of ~65 RMSE.
Problems
One of the main problems is that training on the GPU does not speed up the process, even with large batches (128 or 512).
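One way to narrow this down is to time a single forward pass with explicit CUDA synchronization; if the per-step time barely changes between CPU and GPU, the bottleneck is likely CPU-side data loading, graph construction or host-to-device transfer rather than compute, which is common with graphs this small. A rough sketch (function and argument names are mine):

```python
import time
import torch

def time_forward(model, inputs, device):
    """Time one forward pass; cuda synchronization is required for honest
    GPU timings because kernel launches are asynchronous."""
    model = model.to(device)
    inputs = inputs.to(device)
    if device.startswith("cuda"):
        torch.cuda.synchronize()
    t0 = time.perf_counter()
    model(inputs)
    if device.startswith("cuda"):
        torch.cuda.synchronize()
    return time.perf_counter() - t0
```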