Question

Knowee AI · Accepted Answer

The problem with the tanh and sigmoid activation functions is that their derivatives are near zero in many regions: both functions saturate, so the slope approaches zero whenever the input has a large magnitude. During backpropagation these small derivatives are multiplied together layer by layer, which leads to a problem called "vanishing gradients," where the weights and biases of a neural network are updated very little during training, making the learning process extremely slow. This is especially problematic in deep neural networks, where early layers can essentially stop learning.
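To make the saturation concrete, here is a minimal Python sketch that evaluates both derivatives at a few inputs and then shows how a product of per-layer derivatives shrinks with depth. The helper names (`sigmoid_grad`, `tanh_grad`, `chained_gradient_bound`) are illustrative, not from any library, and the depth calculation assumes every layer contributes the sigmoid's maximum derivative of 0.25 with weights of magnitude 1, which is a simplification.

```python
import math


def sigmoid(x):
    """Logistic sigmoid: 1 / (1 + e^-x)."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    """Derivative of the sigmoid: s(x) * (1 - s(x)). Its maximum is 0.25 at x = 0."""
    s = sigmoid(x)
    return s * (1.0 - s)


def tanh_grad(x):
    """Derivative of tanh: 1 - tanh(x)^2. Its maximum is 1.0 at x = 0."""
    return 1.0 - math.tanh(x) ** 2


def chained_gradient_bound(depth, per_layer_grad=0.25):
    """Product of `depth` identical per-layer derivatives.

    Assumption for illustration only: each layer contributes the same
    derivative and the weights have magnitude 1.
    """
    return per_layer_grad ** depth


# Both derivatives fall toward zero as |x| grows (saturation).
for x in (0.0, 2.0, 5.0, 10.0):
    print(f"x={x:5.1f}  sigmoid'={sigmoid_grad(x):.6f}  tanh'={tanh_grad(x):.6f}")

# Even in the best case for sigmoid, the product shrinks quickly with depth.
for depth in (1, 5, 10, 20):
    print(f"depth={depth:2d}  gradient factor={chained_gradient_bound(depth):.3e}")
```

Under these assumptions the gradient factor after 10 sigmoid layers is already below one in a million, which is why the early layers of a deep network receive almost no update.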

What is the problem with the tanh and sigmoid activation functions? (1 point)

They are discontinuous functions
You can't take the derivative
The derivative is near zero in many regions
They are periodic functions
