The domain adaptation phase seems not reproducible #7

Open
Howard-Hsiao opened this issue Jan 31, 2024 · 0 comments
Hello,
I recently tried your program, but the results did not meet my expectations. Following the hyperparameter configuration you provided for the RHD->H3D_crop protocol, the best score I could achieve on the H3D_crop target domain was 0.689, so the domain adaptation phase does not seem to be working as expected. I was wondering whether any hyperparameter settings were left out of the README.md, and whether you could offer some guidance on how to address this. Thanks in advance for your help.

First of all, there were issues with the train_sfda.py script. I made some code revisions, which I describe in the following sections.

1. restructure the optimizer.step part

        with torch.cuda.amp.autocast():
            y_t_sr = model_sr(x_t_stu)
            y_t_in = model_in(x_t_stu)

            loss_ft = criterion(y_t_sr, y_t_in)
            loss_res = res_criterion(y_t_sr, y_t_in)
            loss_in = loss_ft + 0.7 * loss_res

        scaler_sr.scale(loss_ft).backward()
        scaler_sr.step(sr_optimizer)
        scaler_sr.update()

        scaler_in.scale(loss_in).backward()
        scaler_in.step(in_optimizer)
        scaler_in.update()

to

        with torch.cuda.amp.autocast():
            y_t_sr = model_sr(x_t_stu)
            with torch.no_grad():
                y_t_in = model_in(x_t_stu)
            loss_ft = criterion(y_t_sr, y_t_in)

        scaler_sr.scale(loss_ft).backward()
        scaler_sr.step(sr_optimizer)
        scaler_sr.update()

        with torch.cuda.amp.autocast():
            with torch.no_grad():
                y_t_sr = model_sr(x_t_stu)
            y_t_in = model_in(x_t_stu)

            loss_ft = criterion(y_t_sr, y_t_in)
            loss_res = res_criterion(y_t_sr, y_t_in)
            loss_in = loss_ft + 0.7 * loss_res

        loss_in.backward()
        in_optimizer.step()

Without these adjustments, I encountered a series of errors, such as:

RuntimeError: Trying to backward through the graph a second time (or directly access saved tensors after they have already been freed). Saved intermediate values of the graph are freed when you call .backward() or autograd.grad(). Specify retain_graph=True if you need to backward through the graph a second time or if you need to access saved tensors after calling backward.

After I changed the code to scaler_sr.scale(loss_ft).backward(retain_graph=True), I got:

Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [256]] is at version 37; expected version 36 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
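
For reference, here is a minimal, self-contained sketch of why this happens, using toy models rather than the repo's code: when both losses are built from the same forward passes, they share one autograd graph, so the first backward() frees it and the second backward() fails; with retain_graph=True, the first optimizer step then modifies parameters in place that the second backward still needs, which trips the version check. Detaching the other model's output with torch.no_grad() gives each loss its own graph, which is what the restructured code above does.

    import torch
    import torch.nn as nn

    model_a = nn.Linear(8, 8)   # toy stand-ins for model_sr / model_in
    model_b = nn.Linear(8, 8)
    opt_a = torch.optim.SGD(model_a.parameters(), lr=0.1)
    opt_b = torch.optim.SGD(model_b.parameters(), lr=0.1)
    x = torch.randn(4, 8)

    # Failing pattern: both losses share the graph built from the same forward passes.
    y_a, y_b = model_a(x), model_b(x)
    loss_a = (y_a - y_b).pow(2).mean()
    loss_b = (y_a - y_b).abs().mean()
    loss_a.backward()       # frees the shared graph
    opt_a.step()            # in-place parameter update
    try:
        loss_b.backward()   # RuntimeError: backward through the graph a second time
    except RuntimeError as err:
        print("expected failure:", err)

    # Working pattern: detach the other model's output so each loss has its own graph.
    opt_a.zero_grad()
    opt_b.zero_grad()
    with torch.no_grad():
        y_b_const = model_b(x)
    loss_a = (model_a(x) - y_b_const).pow(2).mean()
    loss_a.backward()
    opt_a.step()

    with torch.no_grad():
        y_a_const = model_a(x)
    loss_b = (model_b(x) - y_a_const).abs().mean()
    loss_b.backward()
    opt_b.step()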

2. im_criterion

I modified the code to use loss_im = im_criterion(y_t_tg, -1) instead of loss_im = im_criterion(y_t_in, y_t_tg) in an attempt to align with the paper.
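
I am not certain what im_criterion computes internally, so take this only as my reading of the call: with the second argument reduced to a constant, it looks like a single-prediction information-maximization (entropy) objective. A rough, hypothetical sketch of that kind of loss over keypoint heatmaps (names and shapes are my assumptions, not the repo's) would be:

    import torch
    import torch.nn.functional as F

    def im_loss_sketch(heatmaps: torch.Tensor, _placeholder: int = -1) -> torch.Tensor:
        # Hypothetical: treat each keypoint's (H, W) heatmap as a distribution over
        # pixel locations and minimize its entropy; the real im_criterion may differ.
        b, k, h, w = heatmaps.shape
        logits = heatmaps.view(b, k, h * w)
        probs = F.softmax(logits, dim=-1)
        log_probs = F.log_softmax(logits, dim=-1)
        entropy = -(probs * log_probs).sum(dim=-1)   # per-sample, per-keypoint entropy
        return entropy.mean()                        # scalar loss

    y_t_tg = torch.randn(2, 21, 64, 64, requires_grad=True)   # (B, K, H, W) toy heatmaps
    loss_im = im_loss_sketch(y_t_tg, -1)
    loss_im.backward()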

3. CST_Loss

I modified the code to make the loss a scalar so that I can call loss.backward().
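
Concretely, the change amounts to reducing the per-sample/per-joint loss tensor to a scalar before calling backward(). A toy stand-in for CST_Loss (not the actual implementation) to show what I mean:

    import torch

    # Hypothetical per-joint loss standing in for CST_Loss; it returns a (B, K) tensor.
    def cst_loss_per_joint(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        return (pred - target).pow(2).sum(dim=-1)

    pred = torch.randn(4, 21, 2, requires_grad=True)   # toy keypoint predictions
    target = torch.randn(4, 21, 2)

    loss = cst_loss_per_joint(pred, target)
    if loss.dim() > 0:
        loss = loss.mean()   # reduce to a scalar so loss.backward() works
    loss.backward()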
