The λ argument in the forward function of the DifferentialTransformer class is never used. Does it need to be passed at all? Also, for the initialization of λ, should the following function be used?
import math

def lambda_init_fn(depth):
    return 0.8 - 0.6 * math.exp(-0.3 * depth)
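To make the question concrete, here is a minimal sketch of how a depth-dependent initial value might feed into λ. It assumes the reparameterization described in the Differential Transformer paper, λ = exp(λq1·λk1) − exp(λq2·λk2) + λ_init, and a zero-based layer index for depth. The helper `reparameterized_lambda` and its argument names are hypothetical and not taken from this repository. Plain Python lists stand in for the learnable vectors.

```python
import math

def lambda_init_fn(depth: int) -> float:
    # Depth-dependent starting value; assumes depth is a zero-based layer index.
    return 0.8 - 0.6 * math.exp(-0.3 * depth)

def reparameterized_lambda(lq1, lk1, lq2, lk2, lambda_init):
    # Hypothetical helper: lambda = exp(lq1 . lk1) - exp(lq2 . lk2) + lambda_init.
    # In a real model the four vectors would be learnable parameters.
    dot1 = sum(a * b for a, b in zip(lq1, lk1))
    dot2 = sum(a * b for a, b in zip(lq2, lk2))
    return math.exp(dot1) - math.exp(dot2) + lambda_init

# With all-zero vectors the two exponentials cancel, so lambda equals lambda_init.
zeros = [0.0, 0.0]
for depth in range(4):
    init = lambda_init_fn(depth)
    lam = reparameterized_lambda(zeros, zeros, zeros, zeros, init)
    print(depth, round(init, 4), round(lam, 4))
```

Under these assumptions each layer would compute its own λ from its learnable vectors and its λ_init, so λ would not need to be passed into forward from outside.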