
What are activation functions?

Activation functions are the key to neural networks (NNs). They are what allow a network to learn non-linear relationships in data; without them, an NN would just be a stack of linear transformations. If you know the basics of logistic regression, this will sound familiar. I am planning to write another blog covering activation functions from the basics, so the focus here is just to compare the softmax and sigmoid activation functions.

Difference between sigmoid and softmax: which one to choose?

Many times in the exam, you'll have to decide which activation function to use. The most important distinction to understand is the one between softmax and sigmoid.

Sudo Exam Tip: Remember that most of the time, the hidden layers use ReLU or tanh activation functions. In the output layer, depending on the problem statement, you'll have to choose between sigmoid and softmax.
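
For quick reference, these are the standard definitions, with $z$ denoting the vector of raw scores (logits) coming out of the output layer:

$$\text{sigmoid}(z_i) = \frac{1}{1 + e^{-z_i}} \qquad\qquad \text{softmax}(z)_i = \frac{e^{z_i}}{\sum_{j} e^{z_j}}$$

Sigmoid squashes each score independently into (0, 1); softmax normalises the whole vector so the outputs sum to 1.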

| Softmax | Sigmoid |
| --- | --- |
| Used in multi-class classification | Used in binary classification and multi-label classification |
| The probabilities of all the classes sum to 1 | The probabilities are NOT constrained to sum to 1 |
| The probabilities are inter-related: raising one class's probability lowers the others' | Independent probabilities, i.e. the probability of a class does not depend on the probability of another class, hence they don't have to sum to 1 |
| One right answer: as the name suggests (soft`max`), the class with the maximum probability is the model's prediction | More than one right answer: you can, for example, pick every class above a threshold, or the top five probabilities |
| Mutually exclusive outputs | Mutually non-exclusive outputs |
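
To make the "sums to 1" vs. "independent probabilities" distinction concrete, here is a minimal NumPy sketch (the logit values are made up purely for illustration):

```python
import numpy as np

def softmax(z):
    """Softmax over a vector of logits: exponentiate, then normalise."""
    exp_z = np.exp(z - np.max(z))  # subtract the max for numerical stability
    return exp_z / exp_z.sum()

def sigmoid(z):
    """Element-wise sigmoid: each logit is squashed to (0, 1) independently."""
    return 1.0 / (1.0 + np.exp(-z))

logits = np.array([2.0, 1.0, 0.1])  # made-up raw scores for 3 classes

print(softmax(logits))        # e.g. [0.659 0.242 0.099]
print(softmax(logits).sum())  # 1.0  -> the probabilities are inter-related
print(sigmoid(logits))        # e.g. [0.881 0.731 0.525]
print(sigmoid(logits).sum())  # ~2.14 -> no constraint to sum to 1
```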
Example:
MNIST is a dataset of images of handwritten digits from 0 to 9, and each image contains exactly one digit. Therefore, when a model is trained on it, softmax activation at the output layer is apt: the model predicts the single right answer, i.e. the digit with the highest probability.
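
As a sketch (assuming TensorFlow/Keras; the hidden-layer size is just a placeholder), the key detail is the 10-unit softmax output paired with a categorical cross-entropy loss:

```python
import tensorflow as tf

# Minimal classifier for 28x28 MNIST images; the hidden layer size is arbitrary.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax'),  # one probability per digit, summing to 1
])

model.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',  # labels are integer digits 0-9
    metrics=['accuracy'],
)

# Prediction: take the class with the highest probability.
# digit = tf.argmax(model.predict(images), axis=1)
```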
Example:
A chest X-ray image can show many diseases at the same time, for example pneumonia, cardiomegaly, nodule, abscess, etc. Sigmoid activation at the output layer is apt here, as a single patient can have several of these findings simultaneously.
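
A corresponding sketch for the multi-label case (the 14 disease labels and the simple dense backbone are placeholder assumptions, not from the original example): each output unit is an independent sigmoid, trained with binary cross-entropy.

```python
import tensorflow as tf

NUM_FINDINGS = 14  # hypothetical number of disease labels: pneumonia, cardiomegaly, ...

# Placeholder backbone; in practice a CNN would be used for X-ray images.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(224, 224, 1)),
    tf.keras.layers.Dense(256, activation='relu'),
    tf.keras.layers.Dense(NUM_FINDINGS, activation='sigmoid'),  # independent probability per finding
])

model.compile(
    optimizer='adam',
    loss='binary_crossentropy',  # each label is a separate yes/no decision
    metrics=['accuracy'],
)

# Prediction: flag every finding whose probability crosses a threshold.
# findings = model.predict(images) > 0.5
```

Note the contrast with the MNIST sketch: here the outputs are not normalised against each other, so the model can assign a high probability to several diseases at once.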